Supervised vs. Unsupervised Learning

Machine learning (ML) is a powerful tool that enables computers to learn from data and make predictions or decisions. But not all machine learning is the same – there are different types of learning, each suitable for specific tasks. The two most common types are supervised learning and unsupervised learning. In this article, we'll explore the differences between them, provide real-world examples, and walk through code snippets to help you understand how they work.


What is supervised learning?

Supervised learning is a type of machine learning in which an algorithm learns from labeled data. In other words, the data you provide to the model includes input features and the correct outputs (labels). The goal is for the model to learn the relationship between inputs and outputs so that it can make accurate predictions on new, unseen data.

Real-world examples of supervised learning

Email Spam Detection:

  • Input: The text of the email.
  • Output: Label indicating whether the email is "Spam" or "Not Spam".
  • The model learns to classify emails based on labeled examples.
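As a rough illustration of the spam example, the sketch below trains a tiny text classifier with scikit-learn. The six emails and their labels are made up for this illustration, and Naive Bayes over word counts is only one of several reasonable model choices, not something the example above prescribes.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Hypothetical labeled data: every email text comes with its known label.
emails = [
    "Win a free prize now",
    "Claim your free money today",
    "Free prize waiting, click now",
    "Meeting moved to Monday morning",
    "Please review the attached report",
    "Lunch on Monday with the team",
]
labels = ["Spam", "Spam", "Spam", "Not Spam", "Not Spam", "Not Spam"]

# Convert each email to word counts, then fit a Naive Bayes classifier on them.
spam_model = make_pipeline(CountVectorizer(), MultinomialNB())
spam_model.fit(emails, labels)

# Classify emails the model has not seen before.
new_emails = ["Free prize money", "Report for the Monday meeting"]
predictions = spam_model.predict(new_emails)
print(list(predictions))
```

With a training set this small the result only demonstrates the workflow (labeled examples in, predicted labels out); a real spam filter would need far more data.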

House Price Prediction:

  • Input: Characteristics of the home (e.g. square footage, number of bedrooms, location).
  • Output: Price of the house.
  • The model learns to predict prices based on historical data.

Medical Diagnosis:

  • Input: Patient data (e.g., symptoms, lab results).
  • Output: Diagnosis (e.g. "Healthy" or "Diabetes").
  • The model learns to diagnose based on labeled medical records.

What is unsupervised learning?

Unsupervised learning is a type of machine learning in which algorithms learn from unlabeled data. Unlike supervised learning, no correct output is provided. Instead, models try to find patterns, structures, or relationships in the data on their own.

Real-world examples of unsupervised learning

Customer Segmentation:

  • Input: Customer data (e.g. age, purchase history, location).
  • Output: Groups of similar customers (e.g., "high-frequency buyers", "budget shoppers").
  • The model identifies clusters of customers with similar behavior.

Anomaly Detection:

  • Input: Network traffic data.
  • Output: Flags on unusual patterns that may indicate a cyberattack.
  • The model detects outliers or anomalies in the data.
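The anomaly detection idea can be sketched with scikit-learn's `IsolationForest`, which is one possible detector among many. The traffic numbers below are synthetic, and the two feature meanings (packets per second, average packet size) are assumptions made up for this illustration. No labels are given to the model; it only ranks points by how unusual they look.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

# Synthetic "normal" traffic: columns are assumed to be
# packets per second and average packet size in bytes.
rng = np.random.default_rng(0)
normal_traffic = rng.normal(loc=[100.0, 500.0], scale=[10.0, 50.0], size=(50, 2))

# One hand-crafted suspicious observation, appended as the last row (index 50).
suspicious = np.array([[5000.0, 60.0]])
traffic = np.vstack([normal_traffic, suspicious])

# Fit without any labels; the model learns what typical traffic looks like.
detector = IsolationForest(random_state=42)
detector.fit(traffic)

# score_samples returns lower values for more anomalous points.
scores = detector.score_samples(traffic)
most_anomalous = int(np.argmin(scores))
print(f"Most anomalous row: {most_anomalous}")
```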

Market Basket Analysis:

  • Input: Grocery store transaction data.
  • Output: Groups of products that are often purchased together (e.g., "bread and butter").
  • The model identifies associations between products.
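A minimal sketch of the market basket idea, using only the Python standard library: count how often each pair of products appears in the same transaction and report the pair with the highest support. The five transactions are invented for illustration, and real association-rule mining (for example the Apriori algorithm) goes further than this simple pair count.

```python
from collections import Counter
from itertools import combinations

# Hypothetical grocery transactions; each set is one customer's basket.
transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "eggs"},
    {"bread", "butter", "eggs"},
    {"milk", "bread"},
]

# Count every unordered pair of products that shares a basket.
pair_counts = Counter()
for basket in transactions:
    for pair in combinations(sorted(basket), 2):
        pair_counts[pair] += 1

# Support = fraction of all transactions that contain both products.
support = {pair: count / len(transactions) for pair, count in pair_counts.items()}
top_pair, top_support = max(support.items(), key=lambda item: item[1])
print(top_pair, top_support)  # ('bread', 'butter') 0.6
```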

The main differences between supervised learning and unsupervised learning

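| Aspect | Supervised learning | Unsupervised learning |
|---|---|---|
| Data | Labeled (inputs and outputs provided) | Unlabeled (inputs only) |
| Goal | Predict outcomes or classify data | Discover patterns or structure in the data |
| Examples | Classification, regression | Clustering, dimensionality reduction |
| Complexity | Easier to evaluate (outputs are known) | Harder to evaluate (no ground truth) |
| Use cases | Spam detection, price prediction | Customer segmentation, anomaly detection |

---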
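The difference in evaluation can be made concrete with a small sketch. With labels, predictions are compared directly against the known answers; without labels, one common fallback is an internal measure such as the silhouette score, which only describes how well separated the clusters are. All values below are made up for illustration.

```python
from sklearn.metrics import accuracy_score, silhouette_score

# Supervised: known labels exist, so predictions can be checked against them.
y_true = [1, 0, 1, 1, 0]
y_pred = [1, 0, 0, 1, 0]
accuracy = accuracy_score(y_true, y_pred)  # 4 of 5 predictions are correct
print(f"Accuracy: {accuracy:.2f}")

# Unsupervised: no labels, so we measure cluster separation instead.
points = [[1.0, 1.0], [1.2, 0.9], [0.9, 1.1], [8.0, 8.0], [8.1, 7.9], [7.9, 8.2]]
cluster_ids = [0, 0, 0, 1, 1, 1]
silhouette = silhouette_score(points, cluster_ids)  # ranges from -1 to 1
print(f"Silhouette score: {silhouette:.2f}")
```

A high silhouette score says the groups are compact and far apart, but it cannot confirm that the groups are meaningful for the business question, which is why unsupervised results are harder to judge.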

Code Example

Let’s dig into some code and see how supervised and unsupervised learning work in practice. We will use Python and the popular scikit-learn library.

Supervised Learning Example: Predicting House Prices

We will use a simple linear regression model to predict the price of a home from a single feature, its square footage.

# Import libraries
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

# Create a sample dataset
data = {
    'SquareFootage': [1400, 1600, 1700, 1875, 1100, 1550, 2350, 2450, 1425, 1700],
    'Price': [245000, 312000, 279000, 308000, 199000, 219000, 405000, 324000, 319000, 255000]
}
df = pd.DataFrame(data)

# Features (X) and labels (y)
X = df[['SquareFootage']]
y = df['Price']

# Split the data into training and test sets (80% / 20%)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train the linear regression model
model = LinearRegression()
model.fit(X_train, y_train)

# Make predictions on the held-out test set
y_pred = model.predict(X_test)

# Evaluate the model
mse = mean_squared_error(y_test, y_pred)
print(f"Mean squared error: {mse:.2f}")
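Once a model like this is trained, it can be inspected and applied to new inputs. The sketch below repeats the same data and split so that it runs on its own, then reads the learned slope and intercept and predicts the price of a hypothetical 2,000 sq ft home (that home is an assumption added for illustration, not part of the dataset).

```python
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Same sample data as in the example above.
df = pd.DataFrame({
    'SquareFootage': [1400, 1600, 1700, 1875, 1100, 1550, 2350, 2450, 1425, 1700],
    'Price': [245000, 312000, 279000, 308000, 199000, 219000, 405000, 324000, 319000, 255000],
})
X = df[['SquareFootage']]
y = df['Price']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
model = LinearRegression().fit(X_train, y_train)

# The fitted line is: price = intercept + slope * square_footage
slope = model.coef_[0]
intercept = model.intercept_
print(f"Estimated price per extra square foot: {slope:.2f}")

# Predict for a hypothetical new home; a DataFrame keeps the feature name consistent.
new_home = pd.DataFrame({'SquareFootage': [2000]})
predicted_price = model.predict(new_home)[0]
print(f"Predicted price for 2000 sq ft: {predicted_price:.0f}")
```

With only ten rows and a two-row test set, the numbers are illustrative rather than reliable estimates.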

Unsupervised Learning Example: Customer Segmentation

We will use the K-means clustering algorithm to group customers based on their age and spending habits.

# Import libraries
import pandas as pd
from sklearn.cluster import KMeans
import matplotlib.pyplot as plt

# Create a sample dataset
data = {
    'Age': [25, 34, 22, 45, 32, 38, 41, 29, 35, 27],
    'SpendingScore': [30, 85, 20, 90, 50, 75, 80, 40, 60, 55]
}
df = pd.DataFrame(data)

# Features (X); there are no labels in unsupervised learning
X = df[['Age', 'SpendingScore']]

# Train the K-means clustering model
# (n_init is set explicitly because its default differs between scikit-learn versions)
kmeans = KMeans(n_clusters=3, n_init=10, random_state=42)
df['Cluster'] = kmeans.fit_predict(X)

# Visualize the clusters
plt.scatter(df['Age'], df['SpendingScore'], c=df['Cluster'], cmap='viridis')
plt.xlabel('Age')
plt.ylabel('Spending Score')
plt.title('Customer Segmentation')
plt.show()
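The example above fixes the number of clusters at three. Since there are no labels to check against, one common heuristic for choosing that number is the "elbow method": fit K-means for several values of k and look at where the inertia (the within-cluster sum of squared distances) stops dropping sharply. The sketch below is one way to do that on the same sample data; the range of 1 to 5 clusters is an arbitrary choice for illustration.

```python
import pandas as pd
from sklearn.cluster import KMeans

# Same sample data as in the clustering example above.
customers = pd.DataFrame({
    'Age': [25, 34, 22, 45, 32, 38, 41, 29, 35, 27],
    'SpendingScore': [30, 85, 20, 90, 50, 75, 80, 40, 60, 55],
})

# Fit K-means for k = 1..5 and record the inertia for each k.
inertias = []
for k in range(1, 6):
    km = KMeans(n_clusters=k, n_init=10, random_state=42)
    km.fit(customers)
    inertias.append(km.inertia_)

for k, inertia in zip(range(1, 6), inertias):
    print(f"k={k}: inertia={inertia:.1f}")
```

Inertia always shrinks as k grows, so the goal is not the smallest value but the point where adding another cluster brings little further improvement.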

When to use supervised learning vs. unsupervised learning

When to use supervised learning:

  • You have labeled data.
  • You want to predict outcomes or classify data.
  • Examples: Predicting sales, classifying images, detecting fraud.

When to use unsupervised learning:

  • You have unlabeled data.
  • You want to discover hidden patterns or structures.
  • Examples: Grouping customers, reducing the dimensionality of data, finding anomalies.
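Dimensionality reduction is mentioned above but not shown elsewhere in this article, so here is a minimal sketch using principal component analysis (PCA) from scikit-learn. The data is synthetic: three features that are all noisy copies of one hidden signal, an assumption chosen so that a single component can capture almost all of the variation.

```python
import numpy as np
from sklearn.decomposition import PCA

# Synthetic data: one hidden signal observed through three correlated features.
rng = np.random.default_rng(0)
signal = rng.normal(size=(100, 1))
features = np.hstack([signal, 2 * signal, -signal])
features = features + rng.normal(scale=0.05, size=(100, 3))  # small measurement noise

# Compress three columns into one, without using any labels.
pca = PCA(n_components=1)
reduced = pca.fit_transform(features)

print(reduced.shape)                      # (100, 1)
print(pca.explained_variance_ratio_[0])   # share of variance kept by the component
```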

Conclusion

Supervised learning and unsupervised learning are two fundamental approaches in machine learning, each with its own strengths and use cases. Supervised learning is the natural choice for making predictions when you have labeled data, while unsupervised learning is better suited to exploring unlabeled data and discovering patterns in it.

Understanding these differences and practicing with small examples, such as the ones in this article, gives you a solid foundation in both techniques. If you have any questions or want to share your own experiences, please feel free to leave a comment below.

