How Do Neural Networks Really Work in Deep Learning?

Published in Analytics Vidhya, Apr 4, 2021

Introduction to Deep Learning

As a subset of artificial intelligence, deep learning lies at the heart of various innovations: self-driving cars, natural language processing, image recognition and so on.

Deep learning is a subset of machine learning that uses multi-layered models to draw conclusions from input data without being given explicit rules for doing so.

Deep learning can be supervised, semi-supervised, or unsupervised. It is based on representation learning: instead of relying on task-specific, hand-written rules, the model learns useful features from representative examples. For example, if you want to build a model that recognizes cats by breed, you need to prepare a dataset that includes a lot of labeled images of different cats.

The main architectures of deep learning are:

Convolutional neural networks
Recurrent neural networks
Generative adversarial networks
Recursive neural networks

We are going to talk about them in more detail later in this text.

Difference between machine learning and deep learning

Classical machine learning attempts to extract new knowledge from a large set of pre-processed data loaded into the system. Programmers usually need to decide which features of the data the machine should look at, and it learns based on them. Sometimes, a human might intervene to correct its errors.

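However, deep learning is a bit different: it learns the relevant features directly from raw data, so far less manual feature design is needed.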

What are artificial neural networks?

“Artificial neural networks” and “deep learning” are often used interchangeably, which isn’t really correct. Not all neural networks are “deep”, meaning “with many hidden layers”, and not all deep learning architectures are neural networks. There are also deep belief networks, for example.

However, since neural networks are the most hyped algorithms right now and are, in fact, very useful for solving complex tasks, we are going to talk about them in this post.

Definition of an ANN

An artificial neural network is a computational model loosely inspired by the structure of the human brain. It consists of neurons and synapses organized into layers.

An ANN can have millions of neurons connected into one system, which makes it extremely successful at analyzing and even memorizing various information.


Components of Neural Networks

There are different types of neural networks, but they always consist of the same components: neurons, synapses, weights, biases, and activation functions.

Neurons

A neuron, or node, is the basic unit of a neural network: it receives information, performs a simple calculation, and passes the result on.

All neurons in a net are divided into three groups:

Input neurons that receive information from the outside world;
Hidden neurons that process that information;
Output neurons that produce a conclusion.

In a large neural network with many neurons and connections between them, neurons are organized in layers. There is an input layer that receives information, a number of hidden layers, and an output layer that provides the results. Every neuron performs a transformation on the input information.

Neurons work best with numbers in a small range such as [0,1] or [-1,1]. In order to turn raw data into something that a neuron can work with, we need normalization.
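Min-max scaling is one common way to normalize data into such a range. The sketch below is only an illustration: the function name `min_max_normalize` and the sample ages are made up for this example and do not come from any particular library.

```python
def min_max_normalize(values, low=0.0, high=1.0):
    """Rescale values linearly so the smallest maps to low and the largest to high."""
    v_min, v_max = min(values), max(values)
    if v_max == v_min:
        # All values are identical: map everything to the lower bound
        # instead of dividing by zero.
        return [low for _ in values]
    scale = (high - low) / (v_max - v_min)
    return [low + (v - v_min) * scale for v in values]


ages = [18, 30, 42, 66]  # hypothetical raw feature values
print(min_max_normalize(ages))             # rescaled into [0, 1]
print(min_max_normalize(ages, -1.0, 1.0))  # rescaled into [-1, 1]
```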

Wait, but how do neurons communicate? Through synapses.

Synapses and weights

A synapse is what connects the neurons, like an electricity cable. Every synapse has a weight, and the weights also change the input information as it passes through. The output of a neuron connected by a larger weight will dominate in the next neuron, while information from less ‘weighty’ connections will have little influence. One can say that the matrix of weights governs the whole neural system.

How do you know which neuron has the biggest weight? During the initialization (first launch of the NN), the weights are randomly assigned but then you will have to optimize them.
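As a rough sketch of these two ideas, the snippet below draws random starting weights and computes the weighted sum that a receiving neuron sees. The helper names `init_weights` and `weighted_sum`, the uniform range [-1, 1], and the fixed seed are assumptions made for illustration; real frameworks use a variety of initialization schemes.

```python
import random


def init_weights(n_inputs, seed=0):
    """Draw one random weight per incoming synapse (one simple scheme among many)."""
    rng = random.Random(seed)
    return [rng.uniform(-1.0, 1.0) for _ in range(n_inputs)]


def weighted_sum(inputs, weights):
    """Combine the incoming signals, each scaled by the weight of its synapse."""
    return sum(x * w for x, w in zip(inputs, weights))


inputs = [1.0, 1.0]
# With hand-picked weights, the first input contributes most of the sum.
print(weighted_sum(inputs, [0.9, 0.1]))
# Randomly initialized weights are only the starting point that training adjusts.
print(weighted_sum(inputs, init_weights(len(inputs))))
```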

Bias

A bias is an extra constant term added to a neuron's weighted input. It is often described as a bias neuron whose output is always 1 and whose weight is learned like any other. Biases let the model represent patterns that the weights alone cannot.

In neural networks, a bias neuron is usually added to every layer. It plays a vital role by making it possible to move the activation function to the left or right on the graph.

It is true that ANNs can work without bias neurons. However, they are almost always added and counted as an indispensable part of the overall model.
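The shifting effect of a bias can be seen with a single-input neuron. This is a minimal sketch that assumes a sigmoid activation; the function names and the chosen weight and bias values are illustrative only.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def neuron(x, weight, bias):
    """Single-input neuron: the bias is added to the weighted input before activation."""
    return sigmoid(weight * x + bias)


# Without a bias, the output crosses 0.5 at x = 0.
print(neuron(0.0, weight=1.0, bias=0.0))
# With bias = -2 and weight = 1, the same midpoint moves to x = 2,
# so the whole curve has shifted to the right.
print(neuron(2.0, weight=1.0, bias=-2.0))
```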

How ANNs work

Every neuron processes input data to extract a feature. Let’s imagine that we have three features and three neurons, each of which is connected with all these features.

Each of the neurons has its own weights that are used to weight the features. During training, the goal is to find weights for each neuron such that the output of the whole network matches reality as closely as possible.

To transform its weighted input into an output, every neuron has an activation function. Combined, these functions perform a transformation that can be described by a single function F, which is the formula behind the NN’s magic.
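The three-features, three-neurons setup can be sketched in a few lines. The weights, biases, and feature values below are arbitrary numbers chosen for illustration, and the sigmoid activation is an assumption; the text does not prescribe one.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def layer_forward(features, weights, biases):
    """Each row of `weights` belongs to one neuron and holds one weight per feature."""
    outputs = []
    for neuron_weights, bias in zip(weights, biases):
        z = sum(w * x for w, x in zip(neuron_weights, features)) + bias
        outputs.append(sigmoid(z))
    return outputs


features = [0.5, 0.1, 0.9]     # hypothetical normalized inputs
weights = [[0.2, -0.4, 0.1],   # neuron 1
           [0.7, 0.3, -0.5],   # neuron 2
           [-0.6, 0.8, 0.4]]   # neuron 3
biases = [0.0, 0.1, -0.1]
print(layer_forward(features, weights, biases))
```

Stacking several such layers, each feeding its outputs to the next, gives the combined function F mentioned above.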

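There are a lot of activation functions. The most common ones are linear, sigmoid, and hyperbolic tangent. Their main difference is the range of values they output: a linear function is unbounded, the sigmoid stays between 0 and 1, and the hyperbolic tangent stays between -1 and 1.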
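The ranges of these three functions can be checked directly. The snippet is a small sketch using only Python's `math` module; the sample inputs are arbitrary.

```python
import math


def linear(z):
    return z  # unbounded: the output equals the input


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))  # always between 0 and 1


def tanh(z):
    return math.tanh(z)  # always between -1 and 1


for z in (-5.0, 0.0, 5.0):
    print(z, linear(z), sigmoid(z), tanh(z))
```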

How do you train an algorithm?

Neural networks are trained like any other algorithm. You decide what results you want and provide the network with information to learn from. For example, if we want our neural network to distinguish between photos of cats and dogs, we provide plenty of labeled examples of both.

Delta is the difference between the expected output in the data and the actual output of the neural network. Using calculus (gradient descent with backpropagation), we repeatedly adjust the weights of the network to make the delta smaller. Once the delta is zero or close to it, our model is able to predict our example data correctly.
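To make the weight-optimization loop concrete, here is a minimal sketch that trains a single linear neuron with plain gradient descent. It is a toy stand-in for a full network: the data, the learning rate, and the number of epochs are assumptions chosen so that this small example converges.

```python
def train(xs, ys, learning_rate=0.5, epochs=1000):
    """Fit y ~ w * x + b by gradient descent on the mean squared delta."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Delta: network output minus expected output, for every example.
        deltas = [(w * x + b) - y for x, y in zip(xs, ys)]
        # Gradients of the mean squared delta with respect to w and b.
        grad_w = 2.0 / n * sum(d * x for d, x in zip(deltas, xs))
        grad_b = 2.0 / n * sum(deltas)
        # Step against the gradient to shrink the delta.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


xs = [0.0, 0.25, 0.5, 0.75, 1.0]
ys = [1.0, 1.5, 2.0, 2.5, 3.0]  # toy data generated by y = 2x + 1
w, b = train(xs, ys)
print(w, b)  # should end up close to w = 2 and b = 1
```

In a multi-layer network the same idea applies, except that backpropagation is used to compute the gradient for every weight.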

Iteration

This is a counter that increases every time the neural network processes one batch of training examples. In other words, it is the total number of batches the network has completed so far.

Epoch

The epoch counter increases each time the network has gone through the entire training set. More epochs usually lower the training error, but too many can cause the model to overfit.

Batch

Batch size is equal to the number of training examples in one forward/backward pass. The higher the batch size, the more memory space you’ll need.

What is the difference between an iteration and an epoch?

One epoch is one forward pass and one backward pass of all the training examples;
The number of iterations is the number of passes, each pass using [batch size] examples. To be clear, one pass equals one forward pass + one backward pass (we do not count the forward pass and backward pass as two different passes).
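A short calculation shows how the three quantities relate. The dataset size, batch size, and epoch count below are hypothetical numbers picked for the example.

```python
import math


def iterations_per_epoch(n_examples, batch_size):
    """Number of forward/backward passes needed to see every example once."""
    # Round up: a final, smaller batch still counts as one iteration.
    return math.ceil(n_examples / batch_size)


n_examples, batch_size, epochs = 1000, 100, 5  # hypothetical values
per_epoch = iterations_per_epoch(n_examples, batch_size)
print(per_epoch)           # 10 iterations per epoch
print(per_epoch * epochs)  # 50 iterations in total
```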

And what about errors?

Error is a measure of the discrepancy between the expected and the received output. The error should generally trend downward from epoch to epoch. If it does not, something is probably wrong, for example with the learning rate or the data.

The error can be calculated in different ways, but we will consider only two main ways: Arctan and Mean Squared Error.

There is no restriction on which one to use and you are free to choose whichever method gives you the best results. But each method counts errors in different ways:

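With Arctan, every deviation passes through the arctangent before it is squared. Because |arctan(x)| ≤ |x|, the result is never larger than the MSE for the same outputs, and large deviations are dampened.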

$$\frac{\arctan^2(i_1 - a_1) + \dots + \arctan^2(i_n - a_n)}{n}$$

MSE is more balanced and is used more often.

$$\frac{(i_1 - a_1)^2 + (i_2 - a_2)^2 + \dots + (i_n - a_n)^2}{n}$$
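Both error measures are easy to compute by hand. The sketch below assumes that i stands for the ideal (expected) output and a for the actual output of the network, which is how the symbols are read here; the sample values are made up.

```python
import math


def mse(ideal, actual):
    """Mean of the squared differences between ideal and actual outputs."""
    return sum((i - a) ** 2 for i, a in zip(ideal, actual)) / len(ideal)


def arctan_error(ideal, actual):
    """Mean of the squared arctangents of the differences."""
    return sum(math.atan(i - a) ** 2 for i, a in zip(ideal, actual)) / len(ideal)


ideal = [1.0, 0.0, 1.0]   # hypothetical expected outputs
actual = [0.8, 0.1, 0.6]  # hypothetical network outputs
print(mse(ideal, actual))
print(arctan_error(ideal, actual))
```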

What kinds of neural networks exist?

There are so many different neural networks out there that it is simply impossible to mention them all. If you want to learn more about this variety, visit the neural network zoo where you can see them all represented graphically.

Feed-forward neural networks

This is the simplest neural network architecture. A feed-forward network doesn’t have any memory: information flows only from input to output, and there is no going back. For many tasks this is a limitation. For example, when we work with text, the words form a certain sequence, and we want the machine to understand it.

Feedforward neural networks can be applied in supervised learning when the data that you work with is not sequential or time-dependent. You can also use it if you don’t know how the output should be structured but want to build a relatively fast and easy NN.

Recurrent neural networks

A recurrent neural network can process sequences such as texts, videos, or sets of images. It keeps an internal state that remembers the results of previous steps and uses that information to make better decisions about the current one.

Recurrent neural networks are widely used in natural language processing and speech recognition.
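The "memory" of a recurrent network can be illustrated with a single recurrent unit. This is a deliberately tiny sketch: the function `rnn_forward`, its fixed weights, and the tanh activation are assumptions for illustration, and in practice the weights would be learned.

```python
import math


def rnn_forward(sequence, w_input=0.5, w_hidden=0.8, bias=0.0):
    """Run a one-unit recurrent cell over a sequence and return every hidden state."""
    hidden = 0.0
    states = []
    for x in sequence:
        # The new state depends on the current input AND the previous state.
        hidden = math.tanh(w_input * x + w_hidden * hidden + bias)
        states.append(hidden)
    return states


# Same values in a different order produce different final states,
# because earlier inputs are carried forward through the hidden state.
print(rnn_forward([1.0, 0.0, 0.0]))
print(rnn_forward([0.0, 0.0, 1.0]))
```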

Convolutional neural networks

Convolutional neural networks are today’s standard for image-related deep learning tasks and are widely used in other areas as well. Convolutional layers are most often used in feed-forward networks, but they can also be combined with recurrent ones.

Let’s see how they work. Imagine we have an image of Albert Einstein. We can assign a neuron to all pixels in the input image.

But there is a big problem here: if you connect each neuron to all pixels, then, firstly, you will get a lot of weights, so it will be a very computationally intensive operation and take a very long time. Secondly, there will be so many weights that the model will be very prone to overfitting. It will predict everything well on the training examples but work badly on other images.

Therefore, programmers came up with a different architecture where each of the neurons is connected only to a small square in the image. All these neurons will have the same weights, and this design is called image convolution. We can say that we have transformed the picture by walking through it with a filter, which simplifies the process. Fewer weights, faster to compute, less prone to overfitting.
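The sliding-filter idea can be sketched in pure Python. The function `convolve2d`, the 4x4 image, and the 2x2 edge filter are made up for this example; like most deep learning libraries, the sketch does not flip the kernel, so strictly speaking it computes a cross-correlation.

```python
def convolve2d(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and return the feature map."""
    k = len(kernel)
    out_h = len(image) - k + 1
    out_w = len(image[0]) - k + 1
    output = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            # The same kernel weights are reused at every position.
            total = sum(image[r + i][c + j] * kernel[i][j]
                        for i in range(k) for j in range(k))
            row.append(total)
        output.append(row)
    return output


image = [[0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 0, 1, 1]]       # dark left half, bright right half
vertical_edge = [[-1, 1],
                 [-1, 1]]    # responds where brightness rises from left to right
print(convolve2d(image, vertical_edge))  # [[0, 2, 0], [0, 2, 0], [0, 2, 0]]
```

The filter here has only 4 weights, whereas a single neuron connected to every pixel of this image would need 16, and the gap grows quickly with image size.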

For an awesome explanation of how convolutional neural networks work, watch this video by Luis Serrano.

Generative adversarial neural networks

A generative adversarial network is an unsupervised machine learning algorithm that combines two neural networks: one (network G, the generator) generates samples, and the other (network A, usually called the discriminator) tries to distinguish genuine samples from fake ones. Since the networks have opposite goals, to create convincing samples and to reject them, they play an antagonistic game that turns out to be quite effective.
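The opposing goals of the two networks can be written down as two loss functions. The sketch below uses binary cross-entropy, which is one common formulation and not necessarily the only one; the function names and the example scores are assumptions, and no actual networks are trained here.

```python
import math


def binary_cross_entropy(prediction, target):
    """Penalty for predicting probability `prediction` when the truth is `target` (0 or 1)."""
    eps = 1e-12  # keeps log() away from zero
    p = min(max(prediction, eps), 1.0 - eps)
    return -(target * math.log(p) + (1 - target) * math.log(1.0 - p))


def discriminator_loss(score_real, score_fake):
    """Network A wants real samples scored near 1 and generated ones near 0."""
    return binary_cross_entropy(score_real, 1) + binary_cross_entropy(score_fake, 0)


def generator_loss(score_fake):
    """Network G wants its generated samples to be scored as real (near 1)."""
    return binary_cross_entropy(score_fake, 1)


# Hypothetical scores that A assigns to one real and one generated sample.
print(discriminator_loss(score_real=0.9, score_fake=0.2))
print(generator_loss(score_fake=0.2))
```

When A scores a generated sample low, A's loss shrinks while G's loss grows, which is the antagonistic game in miniature.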

GANs are used, for example, to generate photographs that are perceived by the human eye as natural images or deepfakes (videos where real people say and do things they have never done in real life).

What kind of problems do NNs solve?

Neural networks are used to solve complex problems that require analytical calculations similar to those of the human brain. The most common uses for neural networks are:

Classification. NNs sort the data into classes by implicitly analyzing its parameters. For example, a neural network can analyze the parameters of a bank client, such as age, solvency, and credit history, and decide whether to lend them money.
Prediction. The algorithm has the ability to make predictions. For example, it can foresee the rise or fall of a stock based on the situation in the stock market.
Recognition. This is currently the widest application of neural networks. For example, a security system can use face recognition to only let authorized people into the building.

Summary

Deep learning and neural networks are useful technologies that expand human intelligence and skills. Neural networks are just one type of deep learning architecture. However, they have become widely known because NNs can effectively solve a huge variety of tasks and often cope with them better than other algorithms.

If you want to learn more about machine learning, continue reading my blogs:

Audio Data Augmentation: https://vijay-anandan.medium.com/lets-augment-a-audio-data-part-1-5ab5f6a87bae
Sentiment Analysis On Voice Data: https://vijay-anandan.medium.com/sentiment-analysis-of-voice-data-64533a952617
Resample extremely imbalanced datasets: https://vijay-anandan.medium.com/how-to-resample-an-imbalanced-datasets-8e413dabbc21

LinkedIn: https://www.linkedin.com/in/vijay-anadan/