Inspired by the brain, the Artificial Neural Network is the core tool of Deep Learning. We can say an Artificial Neural Network is a recreation of the brain inside a machine. But the question is: how can we recreate that in a machine?
The neuron is the basic building block of an artificial neural network, and it has quite an interesting structure: a cell body and many tail-like branches coming out of it. The main purpose of deep learning is to mimic how the human brain works, so we are going to build an infrastructure that lets machines learn. Our first challenge in creating an artificial neural network is to recreate a neuron.
Let’s have a look at an actual neuron:
Neurons are cells within the nervous system that transmit information to other nerve cells, muscle cells, or gland cells. The cell body is the main part of the neuron; it has branches at the top, called dendrites, and it also has an axon. So, what are the dendrites and the axon for? The dendrites are the receivers of signals for the neuron, and the axon is the transmitter of signals from the neuron. One neuron connects to another through a synapse; synapses are the contact points where one neuron communicates with another.
Now, let’s see how we are going to create neurons in machines and how they work.
So, here’s our neuron, also sometimes called a node. The neuron receives input signals and produces an output signal. Here, the blue neuron is getting signals from the green neurons through synapses. The green neurons form the input layer, and these inputs are the independent variables; they all belong to one single observation, so think of them as one row in your database. The yellow neuron holds the output value. The blue layer of neurons is a hidden layer; it is “hidden” between the input and the output, since the output of one layer is the input of the next. There can be multiple hidden layers, and their job is to transform the input into something meaningful that the output layer can use.

The next thing you need to know about is the synapses: each synapse is assigned a weight. Weights are crucial to how artificial neural networks function, because weights are how neural networks learn. By adjusting the weights, the neural network decides, in every single case, which signals are important and which are not for a particular neuron. When you train an artificial neural network, you are basically adjusting all the weights in all the synapses across the whole network. This is similar to what we do in Linear Regression, i.e. reducing the cost using gradient descent.
What happens inside the neuron?
A few things happen. First, each input value is multiplied by the weight on its synapse; the neuron then takes the sum of these products and applies an activation function to the result, as shown below:
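The computation inside a single neuron can be sketched in a few lines of Python. This is a minimal illustration, not code from any particular library; the function names and the example numbers are made up for demonstration.

```python
import math

def sigmoid(x):
    # Squashes any real number into the range (0, 1); one possible activation.
    return 1.0 / (1.0 + math.exp(-x))

def neuron_output(inputs, weights, activation):
    # Multiply each input by its weight, sum the products,
    # then pass the weighted sum through the activation function.
    weighted_sum = sum(x * w for x, w in zip(inputs, weights))
    return activation(weighted_sum)

# One observation (one row of the dataset) with three independent variables.
inputs = [0.5, -1.0, 2.0]
weights = [0.4, 0.3, 0.1]
print(neuron_output(inputs, weights, sigmoid))  # weighted sum = 0.1, output ≈ 0.525
```

The weighted sum here is 0.5·0.4 + (−1.0)·0.3 + 2.0·0.1 = 0.1, and the sigmoid of 0.1 is roughly 0.525.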
There are several activation functions. We’ll look at four:
Threshold Activation Function
The threshold function is a yes/no type of function:
F(x) = 1 if x >= 0 ; F(x) = 0 if x < 0
Sigmoid Activation function
This is the same function used in Logistic Regression: σ(x) = 1 / (1 + e^(−x)). The good thing about it is that it is smoother than the threshold function, with a gradual progression. The sigmoid is very useful in the final layer, the output layer, especially when you are trying to predict probabilities.
Rectifier Activation Function
This is the most popular activation function for artificial neural networks. It outputs the input directly if the input is positive; otherwise, it outputs zero, i.e. f(x) = max(0, x). It is also called the ReLU activation function.
Hyperbolic Tangent Activation Function
The hyperbolic tangent function is similar to the sigmoid, but its range is (−1, 1), unlike the sigmoid’s range of (0, 1).
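The four activation functions above can be sketched directly with the standard library, which makes their differences easy to compare at a few sample points:

```python
import math

def threshold(x):
    # Yes/no: fires 1 when the input is non-negative, 0 otherwise.
    return 1.0 if x >= 0 else 0.0

def sigmoid(x):
    # Smooth curve from 0 to 1.
    return 1.0 / (1.0 + math.exp(-x))

def relu(x):
    # Passes positive inputs through unchanged, clips negatives to zero.
    return max(0.0, x)

def tanh(x):
    # Smooth curve from -1 to 1.
    return math.tanh(x)

for f in (threshold, sigmoid, relu, tanh):
    print(f.__name__, f(-2.0), f(0.0), f(2.0))
```

Evaluating each function at −2, 0, and 2 shows the key contrasts: the threshold jumps abruptly, ReLU zeroes out negatives, and tanh is the only one that produces negative outputs.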
Training the Artificial Neural Network
Consider all the input nodes as the data points in the first row of your database. We have already discussed that, in a neuron, the input values are first multiplied by the weights, summed, and passed through the activation function to produce the output; this pass is called Forward Propagation. Now let’s see training with Backward Propagation:

1. Randomly initialise the weights to small numbers close to zero.
2. Forward-propagate (from left to right): propagate the activations until you get the predicted result y (the output).
3. Compare the predicted result to the actual result and measure the generated error.
4. Back-propagate (from right to left): the error is propagated backwards, and the weights are updated according to how much each one is responsible for the error. The learning rate decides by how much we update the weights.
5. Repeat these steps, updating the weights after each observation.
6. When the whole training set has passed through the artificial neural network, that makes one epoch. Run more epochs to further reduce the cost (error).

This is how an artificial neural network trains.
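The training loop described above can be sketched for the simplest possible case: a single sigmoid neuron learning the OR function with stochastic gradient descent. This is a toy illustration under assumed data and hyperparameters (real networks backpropagate through many layers):

```python
import math
import random

random.seed(0)

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Tiny illustrative dataset: the logical OR function.
data = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]

# Step 1: initialise weights to small numbers close to zero.
weights = [random.uniform(-0.05, 0.05) for _ in range(2)]
bias = 0.0
lr = 0.5  # learning rate: how much we update the weights each step

for epoch in range(2000):        # one epoch = one pass over the training set
    for x, target in data:
        # Forward propagation: weighted sum, then activation.
        y = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
        # Error gradient (squared-error cost combined with the sigmoid derivative).
        delta = (y - target) * y * (1 - y)
        # Backward propagation: adjust each weight by its share of the error.
        weights = [w - lr * delta * xi for w, xi in zip(weights, x)]
        bias -= lr * delta

preds = [round(sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias))
         for x, _ in data]
print(preds)  # after enough epochs the neuron reproduces OR: [0, 1, 1, 1]
```

Each observation triggers one forward pass, one error measurement, and one weight update; looping over the whole dataset repeatedly (the epochs) gradually drives the error down.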
The cost function is used to measure, and then reduce, the error: it quantifies the performance of your machine learning model as the gap between the predicted output and the actual output. The cost can be reduced by methods like Gradient Descent. The lower the cost, the better the model.
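The source does not name a specific cost function; one common choice is mean squared error, sketched here:

```python
def mean_squared_error(predicted, actual):
    # Average of the squared differences between predictions and targets.
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)

# Predictions close to the targets give a small cost.
print(mean_squared_error([0.9, 0.2, 0.8], [1.0, 0.0, 1.0]))  # ≈ 0.03
```

Squaring the differences penalises large errors more heavily and keeps the cost non-negative, which is why variants of this function appear so often in regression-style training.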