Supervised And Unsupervised Learning

With advances in artificial intelligence, we can now solve problems across many different fields, and you may already be using some of these solutions in your daily life. Two major categories of learning in this field are supervised and unsupervised learning.

Suppose you are given a bunch of e-mails, each marked as either “spam” or “not spam”, and you train a model on them so that it can categorize a new e-mail. This type of learning is called supervised learning.

Now suppose you are invited to a party and meet total strangers. With no prior knowledge about them, you group them yourself, perhaps by gender, age group, clothing, educational qualification, or any other criterion you like. This is unsupervised learning, since you are exploring the data and finding groups without being told what the groups are.

In supervised learning, we teach the computer how to do something, while in unsupervised learning we let the computer figure it out by itself. Does that make sense? Let’s look into this using some examples.

Supervised Learning

Let’s say we need to predict whether an image shows a “cat” or not.

To make a computer learn this type of problem, we need to provide a dataset containing both the input images and their corresponding labels, i.e. whether each image is a cat or not. So, if the dataset includes output labels, the problem can be treated as a supervised learning problem.

Supervised learning follows this pattern: input -> hypothesis -> output

Here the inputs are our training data (for example, images of cats), the hypothesis is the function learned by a machine learning algorithm such as an SVM or a decision tree, and the output is the corresponding label, for example “cat” or “not cat”.
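To make the input -> hypothesis -> output pattern concrete, here is a minimal sketch in Python. It assumes each image has already been reduced to two made-up numeric features, and it uses a simple nearest-centroid rule as the hypothesis instead of an SVM or a decision tree. The function names and the numbers are hypothetical and only there for illustration.

```python
from math import dist

def fit_centroids(inputs, labels):
    """Learn one mean point (centroid) per label from labeled training data."""
    sums, counts = {}, {}
    for x, y in zip(inputs, labels):
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, value in enumerate(x):
            acc[i] += value
        counts[y] = counts.get(y, 0) + 1
    return {y: [s / counts[y] for s in acc] for y, acc in sums.items()}

def predict(centroids, x):
    """The hypothesis: return the label whose centroid is closest to x."""
    return min(centroids, key=lambda y: dist(centroids[y], x))

# Inputs: hypothetical feature vectors. Outputs: their known labels.
train_inputs = [(0.9, 0.8), (0.8, 0.9), (0.2, 0.1), (0.1, 0.3)]
train_labels = ["cat", "cat", "not cat", "not cat"]

centroids = fit_centroids(train_inputs, train_labels)
prediction = predict(centroids, (0.85, 0.75))
print(prediction)  # cat
```

The labels are what make this supervised: without train_labels there would be nothing to compute the per-label centroids from.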

Supervised learning can be further divided into classification and regression.

Classification: In classification problems, we predict a discrete output, for example whether an e-mail is “spam” or “not spam”.
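As a rough illustration of a discrete output, the sketch below trains a very small naive Bayes style word counter on four made-up e-mails. The e-mail texts, the function names and the choice of naive Bayes with add-one smoothing are all assumptions made for this example; a real spam filter would need far more data and care.

```python
from collections import Counter
from math import log

def train(emails, labels):
    """Count how often each word appears under each label."""
    word_counts = {label: Counter() for label in set(labels)}
    label_counts = Counter(labels)
    for text, label in zip(emails, labels):
        word_counts[label].update(text.lower().split())
    return word_counts, label_counts

def classify(model, text):
    """Return the label with the highest log-probability score."""
    word_counts, label_counts = model
    vocab = set()
    for counts in word_counts.values():
        vocab.update(counts)
    total_emails = sum(label_counts.values())
    scores = {}
    for label, counts in word_counts.items():
        total_words = sum(counts.values())
        score = log(label_counts[label] / total_emails)
        for word in text.lower().split():
            # Add-one smoothing so unseen words do not zero out the score.
            score += log((counts[word] + 1) / (total_words + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)

# Hypothetical labeled training data.
emails = [
    "win a free prize now",
    "free money click now",
    "meeting agenda for monday",
    "lunch on monday with the team",
]
labels = ["spam", "spam", "not spam", "not spam"]

model = train(emails, labels)
print(classify(model, "free prize money"))
print(classify(model, "monday team meeting"))
```

Whatever e-mail you pass in, the answer is always one of a fixed set of labels, which is what makes this a classification problem.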

Regression: In regression problems, we predict a continuous output, for example the price of a house.
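For the house price case, a minimal sketch is a straight line fitted by ordinary least squares. The areas and prices below are invented numbers (price is exactly 3 times the area, so the fit is easy to check), and using area as the only feature is an assumption made to keep the example short.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical training data: area in square metres, price in thousands.
areas = [50, 80, 100, 120]
prices = [150, 240, 300, 360]

slope, intercept = fit_line(areas, prices)
predicted = slope * 90 + intercept  # price estimate for a 90 square metre house
print(slope, intercept, predicted)
```

Unlike the classification case, the prediction here can be any number on a continuous scale, not one of a fixed set of labels.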

Unsupervised Learning

Let’s say we have a bunch of T-shirts.

We do not have labels telling us which class each T-shirt belongs to. In unsupervised learning, the model discovers structure in this data on its own. Say the model picks up on T-shirt size and clusters the T-shirts into three groups: small, medium and large.

So in unsupervised learning problems, output labels are not provided, and the computer is left to find some hidden structure and group the data accordingly.
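The T-shirt example can be sketched with a tiny one-dimensional k-means. This assumes each T-shirt is described by a single measurement (chest width in centimetres, invented here) and that we start from three hand-picked initial guesses. Both are assumptions for illustration; real k-means usually picks its starting points at random.

```python
def kmeans_1d(values, centroids, iterations=10):
    """Repeatedly assign values to the nearest centroid and re-average."""
    centroids = list(centroids)
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Move each centroid to the mean of its group (keep it if the group is empty).
        centroids = [
            sum(group) / len(group) if group else centroids[i]
            for i, group in enumerate(clusters)
        ]
    return centroids, clusters

# Hypothetical chest widths in cm. Note that there are no labels anywhere.
widths = [46, 47, 48, 52, 53, 54, 58, 59, 60]

centroids, clusters = kmeans_1d(widths, centroids=[46, 53, 60])
print(centroids)  # [47.0, 53.0, 59.0]
print(clusters)   # [[46, 47, 48], [52, 53, 54], [58, 59, 60]]
```

The algorithm only produces three groups of numbers. Calling them small, medium and large is an interpretation we add afterwards.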

Unsupervised learning is commonly divided further into clustering and association problems.

Clustering: In clustering, the algorithm forms groups within the data, for example grouping news articles by their headlines, as Google News does.
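As a rough idea of how headlines could be grouped without labels, the sketch below puts a headline into the first existing group whose first headline shares enough words with it, measured by Jaccard similarity. This is a simple greedy scheme chosen for illustration, not how Google News actually works, and the headlines and the 0.3 threshold are made up.

```python
def jaccard(a, b):
    """Share of words two headlines have in common (0.0 to 1.0)."""
    return len(a & b) / len(a | b)

def group_headlines(headlines, threshold=0.3):
    """Greedily group headlines by word overlap with each group's first headline."""
    groups = []  # each group is a list of (headline, word_set) pairs
    for headline in headlines:
        words = set(headline.lower().split())
        for group in groups:
            if jaccard(words, group[0][1]) >= threshold:
                group.append((headline, words))
                break
        else:
            # No existing group is similar enough, so start a new one.
            groups.append([(headline, words)])
    return [[h for h, _ in group] for group in groups]

# Hypothetical headlines with no topic labels attached.
headlines = [
    "football team wins cup final",
    "central bank raises interest rates",
    "home team wins football cup",
    "bank raises rates again",
]

groups = group_headlines(headlines)
for group in groups:
    print(group)
```

With these four headlines the two football stories end up in one group and the two bank stories in another, even though nobody told the program what the topics were.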

Association: In association, the algorithm discovers interesting relationships between items in the data, for example recommending a related product to a user on an e-commerce website.
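One simple way to look for such relationships is to count how often two items appear in the same shopping basket. The sketch below computes, for each ordered pair (a, b), the confidence of the rule "a implies b", i.e. the fraction of baskets containing a that also contain b. The baskets and the function name are hypothetical, and real association rule mining (for example the Apriori algorithm) handles larger item sets and also filters by support.

```python
from collections import Counter
from itertools import combinations

def pair_rules(baskets):
    """Return {(a, b): confidence that a basket with a also contains b}."""
    item_counts = Counter()
    pair_counts = Counter()
    for basket in baskets:
        items = sorted(set(basket))
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))
    rules = {}
    for (a, b), together in pair_counts.items():
        rules[(a, b)] = together / item_counts[a]
        rules[(b, a)] = together / item_counts[b]
    return rules

# Hypothetical purchase history.
baskets = [
    ["bread", "butter"],
    ["bread", "butter", "jam"],
    ["bread", "milk"],
    ["milk", "jam"],
]

rules = pair_rules(baskets)
print(rules[("butter", "bread")])  # 1.0
print(rules[("bread", "butter")])
```

Here every basket with butter also contains bread, so a shop might suggest bread to someone who has just added butter.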

Summary

  • Supervised learning works on labeled training data, while unsupervised learning works on unlabeled training data.
  • Unsupervised learning explores the data and finds interesting structure in it.
  • Supervised learning, as the name suggests, has a supervisor: the labels that tell the model the correct output.
  • Unsupervised learning uses algorithms such as k-means and hierarchical clustering, while supervised learning uses algorithms such as SVM, linear regression and logistic regression.
  • Supervised learning can be applied to risk assessment, image classification, fraud detection, object detection, etc.
  • Unsupervised learning can be applied to delivery store optimization, semantic clustering, market basket analysis, etc.

Hope you enjoyed reading.

If you have any doubts or suggestions, please feel free to ask, and I will do my best to help or to improve the post. Good-bye until next time.