The Restricted Boltzmann Machine (RBM) is a technique used for feature selection and feature extraction, and it plays an important role in machine learning and deep learning for dimensionality reduction, classification, regression, and many other tasks. In this article, we will discuss this technique, its features, how it works, and how it is trained.
But before diving in, let us first understand what Boltzmann Machines are.
What Are Boltzmann Machines?
Geoffrey Hinton, a scientist at the University of Toronto, introduced the Boltzmann Machine in 1985. A prominent member of the deep learning community, he is sometimes referred to as the "Godfather of Deep Learning."
The Boltzmann Machine is a generative unsupervised model that learns a probability distribution from a sample dataset and uses that distribution to draw inferences about unseen data. It has an input layer, also known as the visible layer, and one or more hidden layers.
The Boltzmann Machine is a neural network in which neurons are connected not only to neurons in other layers but also to neurons in the same layer. Every neuron is connected to every other neuron. The connections are bidirectional, and visible and hidden neurons all connect to one another. The Boltzmann Machine does not wait for input; it generates data. Neurons generate information regardless of whether they are visible or hidden.
The Boltzmann machine treats all neurons equally and does not distinguish between visible and hidden neurons. To the machine, the neurons taken together form a single system, and it generates the states of that system.
The Boltzmann Machine samples its states from the Boltzmann distribution, which is given by the following equation:
$$P_i = \frac{e^{-\epsilon_i / kT}}{\sum_j e^{-\epsilon_j / kT}}$$
$k$ - Boltzmann's constant
$\epsilon_i$ - the energy of the system in state $i$
$P_i$ - the probability of the system being in state $i$
$\sum_j e^{-\epsilon_j / kT}$ - the sum of this term over all possible states $j$ of the system
$T$ - the temperature of the system
The Boltzmann distribution describes the different states of a system, and Boltzmann machines use this distribution to generate different states of the machine. According to the equation above, the probability that a system is in state i decreases as the energy of that state increases. The system is therefore most stable in its lowest energy state (a gas, for example, is most stable when it spreads out). In Boltzmann machines, the energy of the system is defined in terms of the weights of the synapses. Once the machine has been trained and the weights are set, the system keeps trying to settle into its lowest energy state by adjusting the states of its units.
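To make the equation concrete, here is a minimal Python sketch that evaluates the Boltzmann distribution for a handful of states. The three energy values and the choice of units in which kT = 1 are assumptions made purely for illustration; they do not describe any particular system.

```python
import math

def boltzmann_probabilities(energies, kT=1.0):
    """Return P_i = exp(-E_i / kT) / sum_j exp(-E_j / kT) for each state."""
    # Shift by the minimum energy before exponentiating for numerical
    # stability; the shift cancels in the ratio, so the result is unchanged.
    e_min = min(energies)
    weights = [math.exp(-(e - e_min) / kT) for e in energies]
    partition = sum(weights)  # the normalizing sum over all states
    return [w / partition for w in weights]

# Hypothetical state energies, expressed in units where kT = 1.
energies = [0.0, 1.0, 2.0]

probs = boltzmann_probabilities(energies)
hot_probs = boltzmann_probabilities(energies, kT=100.0)

print(probs)      # lower energy -> higher probability
print(hot_probs)  # at high temperature the states become nearly equally likely
```

The lowest-energy state receives the highest probability, and raising kT flattens the distribution toward a uniform one.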
Let us now understand the types of Boltzmann Machines.
Types of Boltzmann Machines
The various kinds of Boltzmann Machines are:
- Deep Belief Networks (DBNs)
- Restricted Boltzmann Machines (RBMs)
- Deep Boltzmann Machines (DBMs)
We will be discussing each of these types of Boltzmann Machines in brief.
Restricted Boltzmann Machines (RBMs)
The term "restricted" means that we are not allowed to connect two nodes of the same type of layer to each other. In other words, two neurons in the input layer, or two neurons in the hidden layer, cannot be connected to each other. However, connections between the visible layer and the hidden layer are allowed.
Since our machine has no output layer, it may be unclear how we identify and adjust the weights or determine whether a prediction was right. One answer fits all of these questions: the Restricted Boltzmann Machine.
Deep Belief Networks (DBNs)
Suppose we stack several RBMs so that the outputs of the first RBM serve as the input to the second RBM, and so forth. Such networks are called Deep Belief Networks. The connections within each RBM are undirected, while the connections between the layers of the stack are directed, except for the top two layers, whose connections remain undirected. DBNs can be trained in two different ways:
- Greedy Layer-wise Training Algorithm: The RBMs are trained one layer at a time. Once the individual RBMs have been trained (that is, their parameters, the weights and biases, have been set), the direction of the connections between the DBN layers is established.
- Wake-sleep Algorithm: The DBN is trained from the bottom up (the upward connections correspond to the wake phase) and then from the top down (the downward connections correspond to the sleep phase).
We therefore stack the RBMs, train them, and then make sure that the connections between the layers only work downwards (except for the top two layers).
Deep Boltzmann Machines (DBMs)
DBMs are similar to DBNs, except that all connections between layers are undirected, whereas in a DBN the connections between layers are directed (apart from the top two layers). DBMs can extract more complex or sophisticated features and can therefore be used for more challenging tasks.
Now, we will be learning about RBMs in depth.
What Are RBMs?
RBMs differ from Boltzmann machines in one respect: Boltzmann machines have connections among the visible nodes and among the hidden nodes, whereas RBMs do not. In every other respect, Boltzmann machines and RBMs are the same.
An RBM is a neural network that belongs to the family of energy-based models. It is a probabilistic, unsupervised, generative deep learning algorithm. The goal of an RBM is to find the joint probability distribution that maximizes the log-likelihood function. An RBM is undirected and has only two layers: the input (visible) layer and the hidden layer. Every visible node is connected to every hidden node. Because of this two-layer structure, an RBM is also referred to as a symmetric bipartite graph. The visible nodes have no connections to one another, and neither do the hidden nodes. Connections exist only between visible and hidden nodes.
All of the nodes in the original Boltzmann machine are connected. RBM is referred to as a Restricted Boltzmann Machine since it restricts intralayer connectivity.
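For reference, the energy of an RBM with binary units is commonly written as shown here. The symbols are notation introduced for this sketch and are not used elsewhere in this article: $v_i$ and $h_j$ are the states of visible unit $i$ and hidden unit $j$, $a_i$ and $b_j$ are their biases, and $w_{ij}$ is the weight of the connection between them.

$$E(v, h) = -\sum_i a_i v_i - \sum_j b_j h_j - \sum_{i,j} v_i w_{ij} h_j$$

$$p(v, h) = \frac{e^{-E(v, h)}}{\sum_{v', h'} e^{-E(v', h')}}$$

The joint probability has the same form as the Boltzmann distribution with $kT$ set to 1. The energy contains no term that couples two visible units or two hidden units, which is exactly the restriction on intralayer connections.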
Because RBMs are undirected, they do not adjust their weights through the usual combination of backpropagation and gradient descent. Instead, they adjust their weights using a technique known as contrastive divergence. The weights are initialized at random and used to compute the hidden nodes from the visible nodes. These hidden nodes then reconstruct the visible nodes using the same weights; the same weights are used throughout. Even so, the reconstructed nodes are generally not identical to the original ones, because the nodes within a layer are not connected to one another.
We will now go through the features of the Restricted Boltzmann Machine.
Features of Restricted Boltzmann Machine
Some key characteristics of the Restricted Boltzmann Machine are:
- There are no connections within a layer.
- They use a recurrent and symmetric structure.
- It is an unsupervised learning algorithm, meaning that it draws inferences from the input data without labeled responses.
- In their learning process, RBMs attempt to associate a high probability with low-energy states and vice versa.
Let us now look at the working of a Restricted Boltzmann Machine.
Working of RBM
Each visible node takes a low-level feature from an item in the dataset to be learned. At node 1 of the hidden layer, the input x is multiplied by a weight and added to a bias. The result of these two operations is fed into an activation function, which produces the node's output, that is, the strength of the signal passing through it, given input x.
Let us now look at how several inputs combine at a single hidden node. Each x is multiplied by a separate weight, the products are summed and added to a bias, and the result is again passed through an activation function to produce the node's output.
At each hidden node, each input x is multiplied by its respective weight w. In an example with 4 input nodes and 3 hidden nodes, a single input x has three weights, giving 12 weights in total (4 input nodes × 3 hidden nodes). The weights between two layers always form a matrix whose rows correspond to the input nodes and whose columns correspond to the output (here, hidden) nodes.
Each hidden node receives the four inputs multiplied by their respective weights. The sum of these products is again added to a bias (which forces at least some activations to happen), and the result is passed through the activation function, producing one output for each hidden node.
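The computation described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions: the sigmoid activation, the random initial weights, the zero biases, and the example input vector are all choices made here, since the article does not specify them.

```python
import numpy as np

def sigmoid(z):
    """Logistic activation, assumed here as the activation function."""
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)

n_visible, n_hidden = 4, 3

# Weight matrix: one row per input node, one column per hidden node.
W = rng.normal(0.0, 0.1, size=(n_visible, n_hidden))
b_hidden = np.zeros(n_hidden)  # one bias per hidden node

# Hypothetical input: one low-level feature per visible node.
x = np.array([1.0, 0.0, 1.0, 1.0])

# Each hidden node sums its weighted inputs, adds its bias,
# and passes the result through the activation function.
hidden_output = sigmoid(x @ W + b_hidden)

print(W.size)               # 12 weights in total
print(hidden_output.shape)  # (3,), one output per hidden node
```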
Now that you have a basic understanding of how the Restricted Boltzmann Machine works, let us move on to how an RBM is trained.
Training of RBM
Two methods, Gibbs Sampling and Contrastive Divergence, are used to train an RBM.
Gibbs sampling is a Markov chain Monte Carlo algorithm for obtaining a sequence of observations that are approximately drawn from a specified multivariate probability distribution when direct sampling is difficult.
If the input is represented by v and the hidden values by h, then p(h|v) is used to predict the hidden values. When the hidden values are known, p(v|h) is used to predict the regenerated input values. Suppose this process is repeated k times; after k iterations, v_k is obtained from the initial input value v_0.
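The alternation between p(h|v) and p(v|h) can be sketched as follows. This is only an illustration, assuming binary units with sigmoid conditionals; the function names, the random weights, and the starting vector are hypothetical and not taken from the article.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sample_hidden(v, W, b_hidden, rng):
    """Draw binary hidden values h from p(h | v)."""
    p_h = sigmoid(v @ W + b_hidden)
    return (rng.random(p_h.shape) < p_h).astype(float)

def sample_visible(h, W, b_visible, rng):
    """Draw binary visible values v from p(v | h), reusing the same weights."""
    p_v = sigmoid(h @ W.T + b_visible)
    return (rng.random(p_v.shape) < p_v).astype(float)

def gibbs_chain(v0, W, b_visible, b_hidden, k, rng):
    """Run k rounds of v -> h -> v, starting from v0, and return v_k."""
    v = v0
    for _ in range(k):
        h = sample_hidden(v, W, b_hidden, rng)
        v = sample_visible(h, W, b_visible, rng)
    return v

rng = np.random.default_rng(0)
W = rng.normal(0.0, 0.1, size=(4, 3))  # assumed: 4 visible, 3 hidden units
b_visible = np.zeros(4)
b_hidden = np.zeros(3)

v0 = np.array([1.0, 0.0, 1.0, 1.0])    # hypothetical input
vk = gibbs_chain(v0, W, b_visible, b_hidden, k=5, rng=rng)
print(vk)
```

Because the units within a layer are conditionally independent given the other layer, each layer can be sampled in a single vectorized step.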
In Contrastive Divergence, the gradient, which can be pictured as the slope of a graph relating a network's error to its weights, is approximated rather than computed exactly. Contrastive Divergence is an approximate maximum-likelihood learning algorithm. It is used when we cannot directly evaluate a function or a set of probabilities, so we approximate the learning gradient of the algorithm to decide which direction to move in.
The weights are updated in CD. First, the gradient is computed from the reconstructed input, and then the delta is added to the old weights to obtain the new weights.
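A single weight update with one reconstruction step (often called CD-1) might look like the sketch below. It assumes binary units, sigmoid conditionals, and mini-batches stored one example per row; the function name, learning rate, and toy data are hypothetical and chosen only for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cd1_update(v0, W, b_visible, b_hidden, lr, rng):
    """Return updated (W, b_visible, b_hidden) after one CD-1 step."""
    # Positive phase: hidden probabilities driven by the data.
    p_h0 = sigmoid(v0 @ W + b_hidden)
    h0 = (rng.random(p_h0.shape) < p_h0).astype(float)

    # Negative phase: reconstruct the input, then recompute the hidden layer.
    p_v1 = sigmoid(h0 @ W.T + b_visible)
    v1 = (rng.random(p_v1.shape) < p_v1).astype(float)
    p_h1 = sigmoid(v1 @ W + b_hidden)

    # Approximate gradient: data statistics minus reconstruction statistics.
    batch_size = v0.shape[0]
    delta_W = (v0.T @ p_h0 - v1.T @ p_h1) / batch_size

    # New weights = old weights + learning rate * delta.
    W_new = W + lr * delta_W
    b_visible_new = b_visible + lr * (v0 - v1).mean(axis=0)
    b_hidden_new = b_hidden + lr * (p_h0 - p_h1).mean(axis=0)
    return W_new, b_visible_new, b_hidden_new

rng = np.random.default_rng(0)
W = rng.normal(0.0, 0.1, size=(4, 3))
b_visible = np.zeros(4)
b_hidden = np.zeros(3)

# Toy binary dataset with two repeating patterns (illustrative only).
data = np.array([[1, 1, 0, 0], [0, 0, 1, 1]] * 4, dtype=float)

for _ in range(100):
    W, b_visible, b_hidden = cd1_update(data, W, b_visible, b_hidden, 0.1, rng)
```

Using hidden probabilities rather than sampled hidden states in the update is a common variance-reduction choice, not something the article prescribes.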
Let us now explore the various advantages and disadvantages of RBM.
Advantages and Disadvantages of RBM
Some of the Advantages of RBM Are:
- The hidden layer's activations can be included in other models as valuable features to boost performance.
- Due to the limitations on connections between nodes, it is faster than a standard Boltzmann machine.
- It is efficient to compute and, given enough hidden units, expressive enough to encode any distribution over its inputs.
Some of the Disadvantages of RBM Are:
- The CD-k algorithm used to train RBMs is less well known than the backpropagation algorithm.
- Training is more difficult because the gradient of the energy function is hard to calculate.
- Adjusting the weights is difficult.
We will now explore the applications of RBM.
Applications of RBM
Radar Intra-Pulse Real-Time Detection
Radar intra-pulse real-time detection can be implemented using RBMs. Extracting the intra-pulse characteristics of radar signals is of great importance, but it faces difficulties such as limited data representation capability and limited noise tolerance.
The ambiguity function (AF) of radar signals is used as an intra-pulse feature to improve recognition performance, and numerous dimensionality reduction techniques have been used to extract the key information from the AF. However, older algorithms typically process a large amount of data and slow down quickly as the number of sampling points increases, which is incompatible with the demands of current intelligence systems.
Here, the restricted Boltzmann machine (RBM), a stochastic neural network, is used to extract the features. We first calculate the AF of the radar signals and then apply singular value decomposition (SVD) to the main ridge section of the AF to reduce noise at low signal-to-noise ratio (SNR). Finally, the processed data is fed to the trained RBM to obtain the recognition results.
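The article does not say how the SVD noise-reduction step is carried out. One common form, shown here purely as an assumption, is truncated SVD: keep only the largest singular values and rebuild the matrix from them. The synthetic matrix below is a stand-in for a slice of an ambiguity function, not real radar data, and the retained rank of 1 is an arbitrary illustrative choice.

```python
import numpy as np

def svd_denoise(matrix, rank):
    """Return the best rank-`rank` approximation of `matrix` via truncated SVD."""
    U, s, Vt = np.linalg.svd(matrix, full_matrices=False)
    # Keep the `rank` largest singular values and their singular vectors.
    return (U[:, :rank] * s[:rank]) @ Vt[:rank, :]

rng = np.random.default_rng(0)

# Synthetic rank-1 "clean" surface plus Gaussian noise (illustrative only).
clean = np.outer(np.hanning(32), np.hanning(32))
noisy = clean + 0.05 * rng.normal(size=clean.shape)

denoised = svd_denoise(noisy, rank=1)

print(np.linalg.norm(noisy - clean))     # error before denoising
print(np.linalg.norm(denoised - clean))  # error after denoising
```

In a pipeline like the one described, the denoised data would then be flattened or otherwise prepared and passed to the trained RBM.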
Handwritten Digit Recognition
Handwritten digit recognition is a very common problem these days, with applications in check verification, criminal evidence, data entry, and office computerization. It also comes with challenges, including inconsistent writing styles, variations in size and shape, and image noise that alters the topology of the digits. Here, a hybrid RBM-CNN approach is applied to digit recognition. First, the RBM is used to extract the features. The extracted features are then fed to the CNN for classification. RBMs are particularly good at extracting features from input data: by introducing hidden units trained in an unsupervised manner, an RBM is designed to extract discriminative features from large and complex datasets.
Our Learners Also Ask
1. What are restricted Boltzmann machines used for?
Boltzmann machines are commonly used to solve various computational problems. For a search problem, for example, the weights on the connections can be fixed and used to represent the cost function of an optimization problem.
2. Are restricted Boltzmann machines still used?
RBMs are not commonly used at present. Instead, deep feed-forward networks are typically used, built from layers such as convolutional and fully connected layers, with regularization layers such as dropout and, more recently, batch normalization, and with activation layers in between (typically ReLU, though sigmoid and tanh are also used) and possibly some max-pooling layers.
3. What is the difference between Boltzmann and restricted Boltzmann machines?
Both models have two layers, one visible and one hidden. In the Boltzmann Machine, each neuron in the visible layer is connected to every neuron in the hidden layer, and the neurons within each layer are connected to one another as well. An RBM, on the other hand, is a special case of the Boltzmann machine with the restriction that neurons within a layer are not connected, i.e., there is no intra-layer communication. This makes the neurons within a layer conditionally independent of one another given the other layer, so the model is simpler to implement: we only need to calculate marginal probabilities, which are simpler to compute.
4. What do you understand about the Restricted Boltzmann Machine (RBM)?
A stochastic artificial neural network that is generative in nature and can learn a probability distribution across its set of inputs is called a Restricted Boltzmann machine (RBM).
5. What are the two layers of the restricted Boltzmann machine?
The input layer, or the visible layer, is the first layer of the RBM, and the hidden layer is the second.
Conclusion
In this article, we briefly discussed Boltzmann machines and their types and looked at Restricted Boltzmann Machines (RBMs) in more depth, covering their features, how they work, how they are trained, their advantages and disadvantages, and their applications.