In machine learning, supervised learning is used when the training data is labeled: for each input X, the correct output Y is already known. The goal is to learn a function that maps X to Y well enough to predict the output for new, unseen inputs.
Unsupervised learning is used when the data is unlabeled: for a given input X, no output Y is known. The goal here is to infer relationships, patterns, and structure from the inputs alone.
Supervised learning mainly falls into the following categories:
- Classification: assigns each input to one of a fixed set of classes. Examples include:
  - Categorizing loan applicants as high-, medium-, or low-risk borrowers.
  - Categorizing emails as spam or not spam.
- Regression: predicts a continuous numerical value such as size, quantity, or age. Examples include:
  - Predicting the age of a person.
  - Predicting the price of a house.
Algorithms: linear regression (regression), logistic regression (classification), neural networks (both), etc.
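To make the regression case concrete, here is a minimal sketch of simple linear regression fitted by ordinary least squares, using only the Python standard library. The house sizes and prices are made-up illustrative values, and `fit_line` and `predict_price` are hypothetical helper names, not part of any library.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance(x, y) / variance(x); the 1/n factors cancel.
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Labeled training data (illustrative values only):
# X = house size in square meters, Y = price in thousands.
sizes = [50, 80, 100, 120]
prices = [110, 170, 210, 250]

slope, intercept = fit_line(sizes, prices)


def predict_price(size):
    """Apply the learned function to a new, unseen input."""
    return slope * size + intercept


print(slope, intercept, predict_price(90))
```

The important point is that both X (`sizes`) and Y (`prices`) are supplied during fitting. That is what makes this supervised: the learned function is then applied to an input it has not seen before.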
Unsupervised learning mainly falls into the following categories:
- Clustering: groups inputs based on similarity. Examples include:
  - Segmenting customers based on location, age, etc.
  - Identifying high-crime neighborhoods.
- Dimensionality reduction: removes redundant or uninformative features from a dataset while keeping the parts of the data that matter most. It is similar to data compression. Examples include:
  - Reducing the number of input dimensions (columns) when training computer vision models.
  - Reducing datasets that capture customers' social media engagement with brands across multiple devices.
Algorithms: hierarchical clustering and k-means for clustering; PCA (principal component analysis) and SVD (singular value decomposition) for dimensionality reduction.
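As a rough illustration of clustering, the sketch below implements a bare-bones k-means loop (alternating assignment and centroid-update steps) in plain Python. The customer data, the choice of two clusters, and the starting centroids are all assumptions made for the example, and `kmeans` is a hypothetical helper, not a library function.

```python
import math


def kmeans(points, initial_centroids, max_iterations=10):
    """Minimal k-means: alternate assignment and update steps."""
    centroids = list(initial_centroids)

    def nearest(point):
        # Index of the centroid closest to this point (Euclidean distance).
        return min(range(len(centroids)),
                   key=lambda i: math.dist(point, centroids[i]))

    for _ in range(max_iterations):
        # Assignment step: put each point in the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            clusters[nearest(p)].append(p)

        # Update step: move each centroid to the mean of its cluster.
        new_centroids = []
        for cluster, old in zip(clusters, centroids):
            if cluster:
                mean = tuple(sum(dim) / len(cluster) for dim in zip(*cluster))
                new_centroids.append(mean)
            else:
                new_centroids.append(old)  # keep an empty cluster's centroid

        if new_centroids == centroids:  # no change, so the loop has converged
            break
        centroids = new_centroids

    labels = [nearest(p) for p in points]
    return centroids, labels


# Unlabeled data (illustrative values only): (age, monthly spend) per customer.
customers = [(22, 30), (25, 35), (24, 28), (58, 90), (61, 95), (60, 88)]

# Assumed starting centroids: one customer from each apparent group.
centroids, labels = kmeans(customers, [customers[0], customers[3]])
print(labels)
```

Note that no Y values are passed in. The algorithm only sees the inputs and produces group assignments on its own, which is the defining trait of unsupervised learning. In practice the result depends on the starting centroids, so real implementations usually try several initializations.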