History –
I do not want to go deep into how ML came into existence, but here is the basic gist. The term "machine learning" was coined by Arthur Samuel in 1959, and he is commonly credited with describing it as the field of study that “gives computers the ability to learn without being explicitly programmed”. For decades most of the work stayed in academia, and only in the last 10-15 years has it started taking shape in practical use.
I personally think the main reason ML was adopted so late is the limitation we had in handling data. As our data handling and computing abilities progressed, we started applying more ML concepts, to the point where we now make decisions based on them.
Implementation –
Though very complex implementations of ML exist, a simple and impressive example is the recommender system. Here we let the machine infer the preferences of a person based on his or her previous browsing history.
Another example we deal with daily is Google Maps or Apple Maps, which use predictive analysis to tell us about traffic conditions based on the traffic at different locations. This is where the machine learns that congestion at node B will affect traffic at node X.
Classification –
Since everyone wants to present a paper for their PhD, people keep creating new classifications in ML (please do not get offended, they are definitely useful for particular studies). But there are two major types.
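One is Supervised and the other is Unsupervised. This classification is based on whether a known response (a target output) is available to the algorithm during training. You may also hear about Semi-supervised Learning, which mixes labelled and unlabelled data, and Reinforcement Learning, which learns from rewards and is often treated as a third category of its own. This article sticks to the two main types.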
1) Supervised Learning –
This is the type of learning where the model is trained on a labelled data set. This means each record in the data has both input parameters and a known output.
For example, suppose we have a model that determines whether a transaction is risky or not (1 or 0, or maybe low, high, critical) using input parameters like the location of the transaction, the amount of the transaction, and the type of product purchased. This is supervised learning. A model of this type, which gives its output as categories, is known as a Classification model.
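To make the risky-transaction example concrete, here is a minimal sketch of a classifier using the nearest neighbor idea. The two numeric features (amount and distance from home), the sample records, and the function name `predict_risk` are all assumptions made up for illustration; a real model would use far more data and scaled features.

```python
import math

# Hypothetical labelled data (invented for this sketch):
# (amount in dollars, distance from home in km) -> 1 = risky, 0 = not risky
training_data = [
    ((20.0, 2.0), 0),
    ((35.0, 5.0), 0),
    ((50.0, 1.0), 0),
    ((900.0, 800.0), 1),
    ((1200.0, 650.0), 1),
    ((750.0, 900.0), 1),
]

def predict_risk(transaction, labelled=training_data):
    """Return the label of the closest labelled transaction (1-nearest neighbor)."""
    _, nearest_label = min(
        labelled, key=lambda pair: math.dist(pair[0], transaction)
    )
    return nearest_label

# A small local purchase and a large far-away purchase
print(predict_risk((40.0, 3.0)))      # closest examples are the non-risky ones
print(predict_risk((1000.0, 700.0)))  # closest examples are the risky ones
```

The point to notice is that the labels (0 and 1) are supplied with the training data, which is what makes this supervised, and that the output is a category rather than a number.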
The other case would be a model that tries to determine the speed of wind based on inputs like air moisture and temperature. Here the output is a number, the wind speed in miles/hr, rather than a category.
This type of supervised learning is called a Regression model. It breaks down further into many other types of models for different requirements, which I may address in some other article.
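As a rough sketch of regression, the snippet below fits a straight line with ordinary least squares. To keep it short it uses only one input (temperature) instead of several, and the readings are invented numbers, not real weather data; the names `fit_line` and `predict_wind_speed` are also just illustrative.

```python
# Hypothetical readings (invented): temperature in degrees F -> wind speed in miles/hr
temperatures = [50.0, 60.0, 70.0, 80.0]
wind_speeds = [5.0, 7.0, 9.0, 11.0]

def fit_line(xs, ys):
    """Fit y = slope * x + intercept using ordinary least squares."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept

slope, intercept = fit_line(temperatures, wind_speeds)

def predict_wind_speed(temperature):
    """Predict a continuous value (miles/hr) for a new temperature."""
    return slope * temperature + intercept

print(predict_wind_speed(65.0))
```

Unlike the classification case, the prediction here can be any number on a continuous scale, which is the defining feature of regression.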
So, the two basic types of Supervised Learning are Regression and Classification, each of which comes in a variety of flavors.
Some common Supervised Learning algorithms are –
- Linear Regression
- Nearest Neighbor
- Gaussian Naive Bayes
- Decision Trees
- Support Vector Machine (SVM)
- Random Forest
2) Unsupervised Learning –
This is the type of learning where we do not give our model a target output. We provide only the input parameters and the model comes out with results. It is the machine that finds structure in the data on its own.
For example, if we provide a model with customer details and it comes up with groups of similar customers, we call that a Clustering model. An eCommerce company could use this by providing data like the amount and type of spending, and the model would create clusters of customers so that the company can target each cluster specifically. This is called Cluster analysis.
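The customer grouping idea can be sketched with a bare-bones version of K-Means. The customer figures below are invented, the function name `k_means` is illustrative, and the starting centroids are picked in a fixed way only so the result is repeatable; real implementations usually choose starting points randomly and scale the features first.

```python
import math

# Hypothetical customers (invented): (yearly spend in dollars, orders per year)
customers = [
    (200.0, 2.0), (250.0, 3.0), (220.0, 2.0),        # occasional buyers
    (5000.0, 40.0), (5200.0, 45.0), (4800.0, 38.0),  # frequent buyers
]

def k_means(points, k, iterations=10):
    """Group points into k clusters; returns (labels, centroids)."""
    # Fixed, evenly spaced starting centroids (an assumption for repeatability)
    centroids = [points[i * len(points) // k] for i in range(k)]
    for _ in range(iterations):
        # Step 1: assign every point to its nearest centroid
        clusters = [[] for _ in range(k)]
        for point in points:
            nearest = min(range(k), key=lambda i: math.dist(point, centroids[i]))
            clusters[nearest].append(point)
        # Step 2: move each centroid to the mean of its assigned points
        centroids = [
            tuple(sum(values) / len(cluster) for values in zip(*cluster))
            if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    labels = [
        min(range(k), key=lambda i: math.dist(point, centroids[i]))
        for point in points
    ]
    return labels, centroids

labels, centroids = k_means(customers, k=2)
print(labels)
```

Note that no labels were supplied anywhere: the code is only told how many groups to find, and the grouping itself comes from the data.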
The other case would be a model where we provide the observations and it tries to figure out the associations between different types of observations. This is called Association analysis, and one of its major applications is Market Basket Analysis (MBA). I already have an article on this, so please take a look. In this type of model, we provide all the items a person bought in one transaction, and we let the machine find associations between products based on the transactions. For instance, bread and butter would show a strong association because they are often bought together.
Some common unsupervised learning algorithms, all of them for clustering, are –
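As a small illustration of the bread and butter case, the sketch below counts how often items appear together and derives two standard measures, support and confidence. The baskets are invented, and a full Market Basket Analysis would use an algorithm such as Apriori over far more transactions; this only shows the counting idea behind it.

```python
from collections import Counter
from itertools import combinations

# Hypothetical baskets (invented): each set is one transaction
transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"milk", "eggs"},
    {"bread", "butter", "eggs"},
]

# How many baskets contain each item, and each pair of items
item_counts = Counter(item for basket in transactions for item in basket)
pair_counts = Counter(
    pair for basket in transactions for pair in combinations(sorted(basket), 2)
)

def support(a, b):
    """Fraction of all baskets that contain both a and b."""
    return pair_counts[tuple(sorted((a, b)))] / len(transactions)

def confidence(a, b):
    """Of the baskets containing a, the fraction that also contain b."""
    return pair_counts[tuple(sorted((a, b)))] / item_counts[a]

print(support("bread", "butter"))
print(confidence("bread", "butter"))
print(confidence("butter", "bread"))
```

Confidence is directional: in this toy data every basket with butter also has bread, but not every basket with bread has butter, so the two confidence values differ.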
- K-Means Clustering
- DBSCAN – Density-Based Spatial Clustering of Applications with Noise
- BIRCH – Balanced Iterative Reducing and Clustering using Hierarchies
- Hierarchical Clustering