Artificial Intelligence (AI) and Machine Learning (ML) are among the most important areas in the current technology space. AI touches all of our lives in one way or another, e.g. search engines, conversational AI, image segmentation. If you are a beginner or just curious about ML and what all the related terminology means, this is the post for you. In some cases it will help mid-level practitioners too, because we have met people who work in this space but have an ambiguous understanding of several terms.
So, let's start learning. We will start with the broadest umbrella term, i.e. AI.
Artificial Intelligence (AI)
Artificial Intelligence is the general idea of attaining human-level intelligence using a machine. It is humanity's pursuit of matching its own level of intelligence. We are still very far from the goal.
But be mindful that AI is not synonymous with driverless cars or image captioning. These may be the best results of our time, but the pursuit itself is much older.
It goes back to 1950, when Alan Turing proposed the idea of the Turing test.
"In the 1940s and 50s, a handful of scientists from a variety of fields (mathematics, psychology, engineering, economics and political science) began to discuss the possibility of creating an artificial brain. The field of artificial intelligence research was founded as an academic discipline in 1956." (Source: Wikipedia)
The initial approach to building logic was based on rules. It is still a common approach in traditional programming. A limitation of this approach is that it is not automatic, i.e. the rules can only be written using the knowledge of domain experts.
Machine Learning is the approach in which data is used to develop the program. We rely on the algorithm to learn the underlying pattern in the data and define all the rules (what it learns is too complex to call a rule, but we may say so for the sake of simplicity).
The depiction below shows how ML and traditional programming differ. The second image shows that ML learns the pattern from one set of data and can then predict the pattern of future data.
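To make the contrast concrete, here is a minimal Python sketch. The transaction-flagging scenario, the function names, and all the numbers are made up for illustration; they are assumptions, not part of any real system. The first function is a rule written by hand, while the second derives the rule (a single threshold) from labelled past data.

```python
# Traditional programming: a domain expert writes the rule by hand.
def rule_based_is_suspicious(amount):
    EXPERT_THRESHOLD = 1000  # hypothetical value chosen by a human expert
    return amount > EXPERT_THRESHOLD


# Machine learning: the rule (here, a single threshold) is derived from data.
def learn_threshold(amounts, labels):
    """Return the threshold that misclassifies the fewest training examples.

    labels[i] is True when amounts[i] was a suspicious transaction.
    """
    values = sorted(set(amounts))
    # Candidate thresholds are the midpoints between neighbouring amounts.
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]
    best_threshold, best_errors = None, len(amounts) + 1
    for threshold in candidates:
        errors = sum((x > threshold) != y for x, y in zip(amounts, labels))
        if errors < best_errors:
            best_threshold, best_errors = threshold, errors
    return best_threshold


# Toy historical data (made up for illustration).
past_amounts = [20, 35, 50, 80, 400, 450, 500]
past_labels = [False, False, False, False, True, True, True]

learned = learn_threshold(past_amounts, past_labels)


# The learned rule is then applied to future, unseen data.
def ml_is_suspicious(amount):
    return amount > learned


print(learned, ml_is_suspicious(300), rule_based_is_suspicious(300))
```

On this toy data the learned threshold falls between the largest normal amount and the smallest suspicious one, so a new amount of 300 is flagged by the learned rule but not by the hand-written one. Real ML algorithms learn far richer patterns, but the division of labour is the same: the human supplies data, and the algorithm supplies the rule.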
Deep Learning is the recent state-of-the-art (Wikipedia) approach that uses deep neural networks, which enable us to analyse complex data, e.g. images and text.
A neural network is a special kind of algorithm whose initial motivation was based on the internal design of the human brain. It is definitely the current best approach, but we never know what the go-to algorithm will be in the next 5 years. This is a very active area of research.
In a very simple form, data can have two aspects, i.e. the data itself and its label, e.g.
- Tweets (Data) and their sentiment, i.e. Happy, Neutral (Label)
- Image (Data) and the object in the image, i.e. Cat/Dog (Label)
- Credit card transactions (Data) and their categorization, i.e. Normal/Fraudulent (Label)
So, supervised learning is the class of ML in which we input both the data and the label to the model. In this class, we can predict the class of an image, i.e. Cat/Dog, or try to predict expected future sales.
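As a rough illustration of learning from data plus labels, the sketch below uses a 1-nearest-neighbour classifier, chosen here only because it is short; it is one of many supervised algorithms. The two features (weight and height) and every number are assumptions invented for this example.

```python
import math

# Labelled training data: (features, label).
# The two features (weight in kg, height in cm) are made up for illustration.
training_data = [
    ((4.0, 25.0), "Cat"),
    ((5.0, 23.0), "Cat"),
    ((3.5, 24.0), "Cat"),
    ((20.0, 55.0), "Dog"),
    ((30.0, 60.0), "Dog"),
    ((25.0, 58.0), "Dog"),
]


def predict(features):
    """Return the label of the closest training example (1-nearest neighbour)."""
    nearest = min(training_data, key=lambda item: math.dist(item[0], features))
    return nearest[1]


print(predict((4.5, 26.0)))   # a small, light animal
print(predict((22.0, 57.0)))  # a large, heavy animal
```

The model never sees an explicit rule such as "dogs are heavier". It only sees examples with their labels and predicts the label of whichever known example is most similar.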
Unsupervised learning is the other class of ML, in which we input only the data to the model. In this class, we can't say anything specific about the class of a record, but we may say that a particular record doesn't look similar to the rest of the data, e.g. credit card fraud.
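The credit card example can be sketched without any labels at all. The snippet below flags amounts that lie unusually far from the average, using a simple z-score; this is only one basic way to spot outliers, and the amounts and the threshold of 2 are assumptions picked for illustration.

```python
import statistics


def find_anomalies(values, z_threshold=2.0):
    """Return the values lying more than z_threshold standard deviations from the mean."""
    mean = statistics.mean(values)
    spread = statistics.pstdev(values)
    if spread == 0:
        return []  # all values are identical, so nothing stands out
    return [v for v in values if abs(v - mean) / spread > z_threshold]


# Unlabelled transaction amounts (made up); nobody told us which one is fraud.
amounts = [20, 25, 22, 30, 27, 24, 26, 500]

anomalies = find_anomalies(amounts)
print(anomalies)  # prints [500]
```

Note that the output does not say "fraud". It only says that one record does not look like the rest, which is exactly the kind of statement unsupervised learning can make.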
But why look beyond Supervised learning?
Supervised learning looks very intuitive, and it is definitely easier to implement and interpret. But getting a large amount of labelled data is expensive because labelling requires human effort. For example, you may get a large amount of image data from the internet, but labelling it will need a lot of effort.
Unsupervised learning can be used standalone, or it can be blended with supervised learning to achieve a variety of tasks. Check our blog post on this, i.e. Anomaly/Novelty.
Reinforcement Learning is a level beyond the two. There are scenarios where we can't even obtain unlabelled data, e.g. how would we generate data for road scenarios to train a driverless car, or how do we get data to train a model to play chess?
In Reinforcement Learning, a software agent makes observations and takes actions within an environment, and in return it receives rewards. Its objective is to learn to act in a way that will maximize its expected rewards over time.
For example, we may create a dummy chessboard (the environment) and an agent (the RL software) to play the game. The agent is rewarded for winning the game by the rules and penalized for losing.
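Chess is far too large for a short example, so the sketch below swaps in a tiny stand-in game: a corridor of five cells where reaching the right end counts as a win (+1) and the left end as a loss (-1). It uses tabular Q-learning, one common RL algorithm among many. The environment, the constants, and all names are assumptions made up for illustration.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable

LOSE, WIN = 0, 4           # terminal cells at the two ends of the corridor
ACTIONS = ["left", "right"]
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.2  # learning rate, discount, exploration rate


def step(state, action):
    """Environment: move one cell and return (next_state, reward, done)."""
    next_state = state - 1 if action == "left" else state + 1
    if next_state == WIN:
        return next_state, 1.0, True    # rewarded for winning
    if next_state == LOSE:
        return next_state, -1.0, True   # penalized for losing
    return next_state, 0.0, False


# Q-table: the agent's current estimate of the long-term reward of each action.
q = {(s, a): 0.0 for s in range(LOSE, WIN + 1) for a in ACTIONS}


def best_action(state):
    return max(ACTIONS, key=lambda a: q[(state, a)])


for episode in range(2000):
    state = random.choice([1, 2, 3])  # start in a random non-terminal cell
    done = False
    while not done:
        # Explore occasionally, otherwise exploit what has been learned so far.
        if random.random() < EPSILON:
            action = random.choice(ACTIONS)
        else:
            action = best_action(state)
        next_state, reward, done = step(state, action)
        # Q-learning update: move the estimate towards reward + discounted future value.
        future = 0.0 if done else max(q[(next_state, a)] for a in ACTIONS)
        q[(state, action)] += ALPHA * (reward + GAMMA * future - q[(state, action)])
        state = next_state

policy = {s: best_action(s) for s in [1, 2, 3]}
print(policy)
```

No dataset is supplied anywhere: the agent generates its own experience by acting in the environment. After training, the learned policy should prefer moving right from every cell, because that is the direction that maximizes expected reward.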
A model is a mathematical representation of the underlying data. But this term is used very casually in this field.
A model has two states:
- Untrained: at this point it is just a mathematical equation without actual values, e.g. $$ salary = \alpha*AGE + \beta*EDUCATION $$
- Trained with data: at this point the mathematical equation has learned the underlying values, e.g. $$ salary = 1.5*AGE + 5*EDUCATION $$ These values of the model are called parameters. This was a very simple example, and not all models are like this. In fact, this is one of the simplest models.
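The two states can be sketched in Python. The data below is made up and generated from the trained equation itself, so that training can be seen recovering the parameters; the least-squares fit used here is one standard way to train such a linear model, and the function names are illustrative assumptions.

```python
# Toy data generated from salary = 1.5*AGE + 5*EDUCATION (made-up numbers).
ages = [25, 30, 35, 40, 45]
education_years = [12, 16, 14, 18, 16]
salaries = [1.5 * a + 5 * e for a, e in zip(ages, education_years)]


def predict(age, education, alpha, beta):
    """The model: an equation whose parameters alpha and beta are unknown until training."""
    return alpha * age + beta * education


def train(ages, education_years, salaries):
    """Least-squares fit of alpha and beta via the 2x2 normal equations."""
    s_aa = sum(a * a for a in ages)
    s_ae = sum(a * e for a, e in zip(ages, education_years))
    s_ee = sum(e * e for e in education_years)
    s_ay = sum(a * y for a, y in zip(ages, salaries))
    s_ey = sum(e * y for e, y in zip(education_years, salaries))
    det = s_aa * s_ee - s_ae * s_ae
    # Cramer's rule for the two unknown parameters.
    alpha = (s_ay * s_ee - s_ey * s_ae) / det
    beta = (s_ey * s_aa - s_ay * s_ae) / det
    return alpha, beta


alpha, beta = train(ages, education_years, salaries)
print(alpha, beta)                   # close to 1.5 and 5
print(predict(50, 20, alpha, beta))  # prediction for a new person
```

Before `train` runs, `predict` is only a formula with blanks in it. After training, the blanks are filled with the values that best explain the data, and the same formula can be used on records it has never seen.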
Bonus - Data Scientist
Data Scientist is a heavily overused term in this field.
If I have to give you an analogy from cricket, a data scientist is like an all-rounder who must be decent at multiple aspects of the game, e.g. batting, bowling, fielding.
Let's look at the life cycle of data: in the beginning it needs data engineering, and in the end it needs ML engineering. A data scientist should be able to own the basic tasks of both and, in addition, take complete ownership of data modelling.
Needless to say, it all depends on the requirements of the organization. A large project may have a separate associate for each role, but a small project might expect people to own overlapping roles.
This was all for beginners; in the next section of the post we will extend the story. Moving on, it will bring in more technical terms, e.g. parametric and non-parametric models. We will also try to explain, in layman's terms, how a model learns the underlying data.