
Naive Bayes examples in Python

王林 · Original · 2023-06-09

Python is a simple, easy-to-learn programming language with a rich set of scientific computing libraries and data processing tools. The Naive Bayes algorithm, a classic machine learning method, is widely used in Python. This article introduces the usage and steps of Naive Bayes in Python through examples.

  1. Introduction to Naive Bayes

The Naive Bayes algorithm is a classification algorithm based on Bayes' theorem. Its core idea is to use the characteristics of a known training data set to infer the class of new data. In practical applications, the Naive Bayes algorithm is often used in scenarios such as text classification, spam filtering, and sentiment analysis.

The defining characteristic of the Naive Bayes algorithm is its assumption that the features are independent of one another given the class. This assumption rarely holds in real situations, which is why the algorithm is called "naive". Despite this, Naive Bayes still performs well on problems such as short text classification.
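In symbols, this is a brief sketch of the standard formulation (not tied to any particular library): Bayes' theorem combined with the independence assumption gives

P(C | x1, x2, ..., xn) ∝ P(C) · P(x1 | C) · P(x2 | C) · ... · P(xn | C)

where C is a class and x1, ..., xn are the feature values. The classifier predicts the class C that makes the right-hand side largest, with the individual probabilities estimated from the training data.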

  2. Using the Naive Bayes classifier

In Python, the steps for using a Naive Bayes classifier can be summarized as follows:

2.1 Prepare data

First, you need to prepare the training data and the test data to be classified. The data can be text, images, audio, and so on, but it must be converted into a form the computer can work with. In text classification problems, the text usually needs to be converted into a vector representation.
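As a small illustration of what "vector representation" means (the two example sentences here are made up), TfidfVectorizer turns a list of texts into a numeric matrix with one row per text and one column per word in the learned vocabulary:

from sklearn.feature_extraction.text import TfidfVectorizer

docs = ['the match was exciting', 'the new phone was released']
vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(docs)            # sparse matrix: one row per document
print(vectorizer.get_feature_names_out())     # the vocabulary learned from the documents
print(X.shape)                                # (2, number of distinct words)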

2.2 Train the model

Next, you need to use the training data set to build the Naive Bayes classifier. The sklearn library provides three commonly used Naive Bayes classifiers (a minimal usage sketch follows the list):

  • GaussianNB: suitable for continuous (real-valued) features.
  • BernoulliNB: suitable for binary (boolean) features.
  • MultinomialNB: suitable for discrete count features, such as word counts in text.
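All three classifiers share the same fit/predict interface. Here is a minimal sketch of that interface using GaussianNB on the iris dataset bundled with sklearn (the dataset is chosen only for illustration because its features are continuous; MultinomialNB and BernoulliNB are used the same way on suitable data):

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import accuracy_score

# Load a small continuous-feature dataset and hold out 20% for testing
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

clf = GaussianNB()
clf.fit(X_train, y_train)
print(accuracy_score(y_test, clf.predict(X_test)))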

Taking text classification as an example, you can use the TfidfVectorizer class provided by the sklearn library to convert the text into a vector representation, and use the MultinomialNB classifier for training.

2.3 Test the model

After training is completed, the test data set is used to evaluate the performance of the model. The test set should be independent of the training set: data the model was trained on must not be used during testing. You can use the accuracy_score function provided by the sklearn library to calculate the accuracy of the model.
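As a quick illustration with made-up label lists, accuracy_score simply returns the fraction of predictions that match the true labels:

from sklearn.metrics import accuracy_score

y_true = ['体育', '科技', '科技', '体育']
y_pred = ['体育', '科技', '体育', '体育']
print(accuracy_score(y_true, y_pred))  # 0.75: three of the four predictions are correct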

  3. Example: Text classification based on Naive Bayes

To demonstrate the practical application of the Naive Bayes classifier, this article takes text classification based on Naive Bayes as an example.

3.1 Prepare data

First, find two text data sets on the Internet, "Sports News" and "Science and Technology News"; each data set contains 1,000 texts. Put the two data sets into separate folders so that the texts are labeled "Sports" (体育) and "Technology" (科技) respectively.

3.2 Use the sklearn library for classification

Next, use the naive Bayes classifier provided by the sklearn library for classification.

(1) Import related libraries

from sklearn.model_selection import train_test_split
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.metrics import accuracy_score
import os

(2) Read the text data and its labels

def read_files(path):
    # Walk the data directory and collect each text together with its label.
    # The label is inferred from the folder name in the file path:
    # '体育' = Sports, '科技' = Technology.
    # Assumes every file lives under one of those two folders.
    text_list = []
    label_list = []
    for root, dirs, files in os.walk(path):
        for file in files:
            file_path = os.path.join(root, file)
            with open(file_path, 'r', encoding='utf-8') as f:
                text = f.read()
                text_list.append(text)
                if '体育' in file_path:
                    label_list.append('体育')
                elif '科技' in file_path:
                    label_list.append('科技')
    return text_list, label_list

(3) Convert text into vector representation

def text_vectorizer(text_list):
    # Convert the raw texts into a TF-IDF feature matrix; the fitted vectorizer
    # is returned so the same vocabulary can be reused at prediction time.
    vectorizer = TfidfVectorizer()
    X = vectorizer.fit_transform(text_list)
    return X, vectorizer

(4) Train the model and return the accuracy

def train(text_list, label_list):
    # Vectorize the texts and hold out 20% of the data as a test set
    X, vectorizer = text_vectorizer(text_list)
    y = label_list
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
    # Fit a multinomial Naive Bayes classifier and evaluate it on the held-out test set
    clf = MultinomialNB()
    clf.fit(X_train, y_train)
    y_pred = clf.predict(X_test)
    acc = accuracy_score(y_test, y_pred)
    return clf, vectorizer, acc

(5) Classify new text with the model

def predict(clf, vectorizer, text):
    # transform expects an iterable of documents, so wrap the single text in a list
    X = vectorizer.transform([text])
    y_pred = clf.predict(X)
    return y_pred[0]
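The article does not show how these functions are wired together, so here is a minimal driver under the assumption that the data described in section 3.1 lives in a './data' directory (the directory name and the sample text are illustrative):

# Hypothetical directory containing the '体育' and '科技' subfolders
data_dir = './data'

text_list, label_list = read_files(data_dir)
clf, vectorizer, acc = train(text_list, label_list)
print('Accuracy on the held-out test set:', acc)

# Classify a new, unseen piece of text
new_text = 'iPhone 12 is finally released!'
print('Predicted category:', predict(clf, vectorizer, new_text))  # expected: '科技' (Technology)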

3.3 Result analysis

Running the above code, the classifier achieves an accuracy of 0.955. For actual classification, you only need to pass the text to be classified to the predict function, which returns its category. For example, the text "iPhone 12 is finally released!" returns the "Technology" (科技) category.

  4. Summary

As a simple and effective classification algorithm, Naive Bayes is widely used in Python. This article has introduced the methods and steps for using a Naive Bayes classifier, and used text classification based on Naive Bayes as an example to demonstrate the classifier in practice. In real applications, operations such as data preprocessing and feature selection are also needed to improve the accuracy of the classifier.

