Artificial Intelligence (AI) and Machine Learning (ML) are some of the hottest topics right now.
The term “AI” is thrown around casually every day. You hear aspiring developers saying they want to learn AI. You also hear executives saying they want to implement AI in their services. But quite often, many of these people don’t understand what AI is.
Once you’ve read this article, you will understand the basics of AI and ML. More importantly, you will understand how Deep Learning, the most popular type of ML, works.
The first step towards understanding how Deep Learning works is to grasp the differences between important terms.
Artificial Intelligence is the replication of human intelligence in computers.
When AI research first started, researchers were trying to replicate human intelligence for specific tasks — like playing a game.
They introduced a vast number of rules that the computer needed to respect. The computer had a specific list of possible actions, and made decisions based on those rules.
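As a rough illustration of this rule-based approach, here is a minimal sketch of a tic-tac-toe player driven entirely by hand-written rules. The board layout, the function names, and the three rules are assumptions made for this example; they do not come from any specific historical system.

```python
# The board is a list of 9 cells holding "X", "O", or " " (empty), indexed 0..8.
WIN_LINES = [
    (0, 1, 2), (3, 4, 5), (6, 7, 8),  # rows
    (0, 3, 6), (1, 4, 7), (2, 5, 8),  # columns
    (0, 4, 8), (2, 4, 6),             # diagonals
]

def completing_move(board, player):
    """Return the empty cell that completes a line for `player`, or None."""
    for line in WIN_LINES:
        marks = [board[i] for i in line]
        if marks.count(player) == 2 and marks.count(" ") == 1:
            return line[marks.index(" ")]
    return None

def choose_move(board, me, opponent):
    """Pick a move by checking hard-coded rules in a fixed order."""
    # Rule 1: win immediately if possible.
    move = completing_move(board, me)
    if move is not None:
        return move
    # Rule 2: otherwise block the opponent's winning move.
    move = completing_move(board, opponent)
    if move is not None:
        return move
    # Rule 3: otherwise take the center, or else the first free cell.
    if board[4] == " ":
        return 4
    return board.index(" ")

board = list("XX  O    ")
print(choose_move(board, "O", "X"))  # O has no winning move, so it blocks X
```

Nothing here is learned from data: every decision comes from a rule a programmer wrote down in advance.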
Machine Learning refers to the ability of a machine to learn from large data sets instead of following hard-coded rules.
ML allows computers to learn by themselves. This type of learning takes advantage of the processing power of modern computers, which can easily process large data sets.
Supervised Learning involves using labelled data sets that have inputs and expected outputs.
When you train an AI using supervised learning, you give it an input and tell it the expected output.
If the output generated by the AI is wrong, it will readjust its calculations. This process is repeated over the data set until the AI’s errors are as small as it can make them.
An example of supervised learning is a weather-predicting AI. It learns to predict weather using historical data. That training data has inputs (pressure, humidity, wind speed) and outputs (temperature).
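The loop of "predict, compare with the expected output, readjust" can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the five data points are made up, the inputs are assumed to be pre-scaled to the range 0 to 1, and the model is a simple weighted sum rather than a real weather model.

```python
# Made-up training data: (pressure, humidity, wind speed) -> temperature.
# All inputs are assumed to be pre-scaled to the range 0..1.
samples = [
    ((0.2, 0.9, 0.5), 12.0),
    ((0.8, 0.3, 0.1), 27.0),
    ((0.5, 0.6, 0.4), 19.0),
    ((0.9, 0.2, 0.3), 28.0),
    ((0.1, 0.8, 0.9), 9.0),
]

def predict(weights, bias, inputs):
    """Weighted sum of the inputs plus a constant offset."""
    return sum(w * x for w, x in zip(weights, inputs)) + bias

def train(samples, epochs=2000, learning_rate=0.05):
    """Repeatedly nudge the weights to shrink the error on each sample."""
    weights = [0.0, 0.0, 0.0]
    bias = 0.0
    for _ in range(epochs):
        for inputs, expected in samples:
            error = predict(weights, bias, inputs) - expected
            # Readjust: move each weight against the direction of the error.
            weights = [w - learning_rate * error * x
                       for w, x in zip(weights, inputs)]
            bias -= learning_rate * error
    return weights, bias

def mean_abs_error(weights, bias, samples):
    """Average size of the prediction error over the data set."""
    return sum(abs(predict(weights, bias, x) - y) for x, y in samples) / len(samples)

untrained_error = mean_abs_error([0.0, 0.0, 0.0], 0.0, samples)
weights, bias = train(samples)
trained_error = mean_abs_error(weights, bias, samples)
print(untrained_error, trained_error)
```

The labelled data set supplies both halves of each example, so the program can always measure how far off it was and correct itself.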
Unsupervised Learning is machine learning on data sets that have no labels, that is, no expected outputs.
When you train an AI using unsupervised learning, you let the AI make logical classifications of the data.
An example of unsupervised learning is a behavior-predicting AI for an e-commerce website. It won’t learn by using a labelled data set of inputs and outputs.
Instead, it will create its own classification of the input data. It will tell you which kind of users are most likely to buy different products.
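One common way to let a program find its own groups is k-means clustering. The article does not name a specific algorithm, so treat the following as one possible sketch: the user data is made up, the two features (visits per month and average order value) are assumptions, and the initialization is deliberately naive.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def centroid(points):
    """Component-wise average of a non-empty list of points."""
    return tuple(sum(values) / len(points) for values in zip(*points))

def nearest(point, centroids):
    """Index of the centroid closest to `point`."""
    return min(range(len(centroids)),
               key=lambda i: squared_distance(point, centroids[i]))

def kmeans(points, k, iterations=10):
    centroids = list(points[:k])  # naive start: the first k points
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for point in points:
            clusters[nearest(point, centroids)].append(point)
        # Move each centroid to the middle of its cluster (keep it if empty).
        centroids = [centroid(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    labels = [nearest(point, centroids) for point in points]
    return centroids, labels

# Made-up users: (visits per month, average order value).
users = [(2, 15.0), (3, 20.0), (1, 10.0), (20, 200.0), (22, 180.0), (19, 210.0)]
centroids, labels = kmeans(users, k=2)
print(labels)
```

No user was ever labelled as an occasional or a frequent buyer; the two groups emerge from the data alone.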
You’re now prepared to understand what Deep Learning is, and how it works.
Deep Learning is a machine learning method. It allows us to train an AI to predict outputs, given a set of inputs. Both supervised and unsupervised learning can be used to train the AI.
We will learn how deep learning works by building a hypothetical airplane ticket price estimation service. We will train it using a supervised learning method.
We want our airplane ticket price estimator to predict the price using the following inputs (we are excluding return tickets for simplicity): origin airport, destination airport, departure date, and airline.
Like animals, our estimator AI’s brain has neurons. In diagrams of neural networks, neurons are usually drawn as circles. These neurons are interconnected.
The neurons are grouped into three different types of layers:
The input layer receives input data. In our case, we have four neurons in the input layer: Origin Airport, Destination Airport, Departure Date, and Airline. The input layer passes the inputs to the first hidden layer.
The hidden layers perform mathematical computations on our inputs. One of the challenges in creating neural networks is deciding the number of hidden layers, as well as the number of neurons for each layer.
The “Deep” in Deep Learning refers to having more than one hidden layer.
The output layer returns the output data. In our case, it gives us the price prediction.
So how does it compute the price prediction?
This is where the magic of Deep Learning begins.
Each connection between neurons is associated with a weight. This weight indicates the importance of the input value. The initial weights are set randomly.
When predicting the price of an airplane ticket, the departure date is one of the heavier factors. Hence, the departure date neuron connections will have a big weight.
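To make the idea of weighted connections concrete, here is a small sketch of random weight initialization and of the weighted sum a single neuron computes. The function names, the range of -1 to 1 for the random weights, and the optional bias term are assumptions for illustration; the article does not specify them.

```python
import random

def init_weights(n_inputs, n_neurons, seed=None):
    """One list of random weights per neuron, one weight per incoming connection."""
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(n_inputs)]
            for _ in range(n_neurons)]

def weighted_sum(inputs, weights, bias=0.0):
    """Each input is scaled by the weight of its connection, then all are added up."""
    return sum(x * w for x, w in zip(inputs, weights)) + bias

# Four inputs (origin, destination, date, airline) feeding three hidden neurons.
hidden_weights = init_weights(n_inputs=4, n_neurons=3, seed=0)
print(weighted_sum([1.0, 2.0], [0.5, -1.0]))  # 1.0*0.5 + 2.0*(-1.0) = -1.5
```

A connection with a large weight lets its input move the sum a lot, while a weight near zero makes the input almost irrelevant.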
Each neuron has an Activation Function. These functions are hard to understand without mathematical reasoning.
Simply put, one purpose of an activation function is to “standardize” the output from the neuron, for example by squeezing it into a fixed range.
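Two widely used activation functions are the sigmoid and ReLU. The article does not say which one the estimator would use, so the sketch below simply shows both.

```python
import math

def sigmoid(x):
    """Squeeze any number into the range 0..1 (written to avoid overflow)."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)

def relu(x):
    """Pass positive values through unchanged and replace negatives with zero."""
    return max(0.0, x)

for value in (-5.0, 0.0, 5.0):
    print(value, sigmoid(value), relu(value))
```

Whatever a neuron's weighted sum turns out to be, the sigmoid keeps its output between 0 and 1, which is the kind of "standardizing" described above.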
Once a set of input data has passed through all the layers of the neural network, it returns the output data through the output layer.
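Putting the pieces together, a pass through the network can be sketched as follows. The layer sizes (4 inputs, two hidden layers of 3 neurons, 1 output), the choice of ReLU, the zero biases, and the already-numeric input values are all assumptions for illustration. Because the weights are random and untrained, the printed "price" is meaningless.

```python
import random

def relu(x):
    return max(0.0, x)

def make_layer(n_inputs, n_neurons, rng):
    """A layer is a (weights, biases) pair with random weights and zero biases."""
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_inputs)]
               for _ in range(n_neurons)]
    biases = [0.0] * n_neurons
    return weights, biases

def forward(inputs, layers):
    """Feed the inputs through every layer in order and return the final outputs."""
    values = inputs
    for index, (weights, biases) in enumerate(layers):
        sums = [sum(w * v for w, v in zip(neuron_weights, values)) + b
                for neuron_weights, b in zip(weights, biases)]
        is_output_layer = index == len(layers) - 1
        # Hidden layers apply the activation; the output layer returns the raw sum.
        values = sums if is_output_layer else [relu(s) for s in sums]
    return values

rng = random.Random(42)
network = [make_layer(4, 3, rng), make_layer(3, 3, rng), make_layer(3, 1, rng)]

# Hypothetical numeric encoding of (origin, destination, departure date, airline).
encoded_ticket = [0.1, 0.7, 0.5, 0.3]
prediction = forward(encoded_ticket, network)[0]
print(prediction)
```

Each layer only ever sees the outputs of the layer before it, which is why the data has to flow through the network in order.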
Nothing complicated, right?
Training the AI is the hardest part of Deep Learning. Why?
For our airplane ticket price estimator, we need to find historical data of ticket prices. And due to the large number of possible airport and departure date combinations, we need a very large list of ticket prices.
To train the AI, we need to give it the inputs from our data set, and compare its outputs with the outputs from the data set. Since the AI is still untrained, its outputs will be wrong.
Once we go through the whole data set, we can create a function that shows us how far the AI’s outputs were from the real outputs. This function is called the Cost Function.
Ideally, we want our cost function to be zero. That’s when our AI’s outputs are the same as the data set outputs.
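One common choice of cost function is the mean squared error. The article does not say which cost function it has in mind, so this is just one possibility, shown with made-up prices.

```python
def mean_squared_error(predicted, expected):
    """Average of the squared differences between predictions and real outputs."""
    return sum((p - e) ** 2 for p, e in zip(predicted, expected)) / len(expected)

real_prices = [120.0, 340.0, 89.0]     # made-up prices from the data set
untrained_guess = [10.0, 15.0, 12.0]   # made-up outputs of an untrained AI

print(mean_squared_error(untrained_guess, real_prices))  # large: very wrong
print(mean_squared_error(real_prices, real_prices))      # 0.0: a perfect match
```

The cost is zero only when every prediction equals the corresponding real output, and it grows as the predictions drift away from them.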
At this point, we have covered almost everything needed to train the AI to produce outputs that are in line with its input data. These are just the basics, but the same approach, applied with precision, is how deep learning trains an AI with supervised learning.
We hope you now have some idea of how Machine Learning, Deep Learning, and neural networks come together to create Artificial Intelligence.
Stay tuned for more interesting facts on AI and Machine Learning!