How to implement linear classification using Python Scikit-learn?

PHPz
2023-08-20 18:57:02

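Linear classification is one of the simplest machine learning problems: the model separates the classes using a linear function of the input features. To implement it, we will use scikit-learn's SGDClassifier, a linear classifier trained with stochastic gradient descent (SGD), to predict the species of iris flowers.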
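To make the idea concrete, a linear classifier computes one score per class as a weighted sum of the features plus a bias, then predicts the class with the highest score. The following is a minimal NumPy sketch of that rule. The weights W, the biases b, and the helper linear_predict are hypothetical values and names chosen for illustration; they are not learned from the iris data.

```python
import numpy as np

# Hypothetical weights and biases for 3 classes and 2 features
# (illustrative values only, not learned from any dataset).
W = np.array([[-2.0,  1.0],
              [ 0.5, -1.0],
              [ 1.5,  0.2]])
b = np.array([-1.0, 0.0, -0.5])

def linear_predict(X):
    """Return the index of the highest-scoring class for each row of X."""
    scores = X @ W.T + b              # shape: (n_samples, n_classes)
    return np.argmax(scores, axis=1)

X_demo = np.array([[-1.0, 1.0],       # class scores: 2.0, -1.5, -1.8
                   [ 2.0, 0.0]])      # class scores: -5.0, 1.0, 2.5
print(linear_predict(X_demo))         # prints [0 2]
```

Training a model such as SGDClassifier amounts to finding values for W and b that fit the training data.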

Steps

You can implement linear classification using Python Scikit-learn by following the steps given below:

Step 1 − Import the necessary packages: scikit-learn, NumPy, and matplotlib.

Step 2 − Load the data set and build the training and test data sets.

Step 3 − Use matplotlib to plot the training instances. This step is optional, but it is good practice because it makes the example clearer.

Step 4 − Create an SGDClassifier object, initialize its parameters, and train the model with the fit() method.

Step 5 − Evaluate the results using the metrics module of scikit-learn.
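The steps above can be sketched compactly as follows. This is one possible arrangement rather than the article's exact code: it skips the optional plotting step, wraps scaling and classification in a scikit-learn pipeline, fixes random_state=0 on the classifier (an assumption, made only so repeated runs give the same result), and measures accuracy on the held-out test set.

```python
# Step 1: import the required pieces of scikit-learn
from sklearn.datasets import load_iris
from sklearn.linear_model import SGDClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Step 2: load the data, keep the first two features, and split 80/20
X, y = load_iris(return_X_y=True)
X = X[:, :2]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.20, random_state=1)

# Step 4: standardize the features and train the SGD classifier.
# random_state=0 is an assumption added here for repeatability.
model = make_pipeline(StandardScaler(), SGDClassifier(random_state=0))
model.fit(X_train, y_train)

# Step 5: evaluate on the test set, which the model has not seen
test_accuracy = accuracy_score(y_test, model.predict(X_test))
print("Test accuracy:", test_accuracy)
```

Using a pipeline means the scaler is fitted on the training data only and then applied unchanged to the test data.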

Example

Let's look at the example below, where we will use two features of the iris flower, sepal length and sepal width, to predict its species.
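As a quick check of which columns hold these two features, the short sketch below prints the feature and class names that ship with scikit-learn's copy of the iris dataset. The variable X_two is a name introduced here for illustration.

```python
from sklearn.datasets import load_iris

iris = load_iris()

# The first two columns are the sepal measurements
print(iris.feature_names[:2])   # ['sepal length (cm)', 'sepal width (cm)']

# The three species the classifier will predict
print(list(iris.target_names))  # ['setosa', 'versicolor', 'virginica']

# Keep only the first two columns of the feature matrix
X_two = iris.data[:, :2]
print(X_two.shape)              # (150, 2)
```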

# Import required libraries
import sklearn
import numpy as np
import matplotlib.pyplot as plt
# %matplotlib inline

# Loading Iris flower dataset
from sklearn import datasets
iris = datasets.load_iris()
X_data, y_data = iris.data, iris.target

# Print the shape of the iris data
print("Original Dataset Shape:", X_data.shape, y_data.shape)

# Split the dataset into training and test sets and standardize the features
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Keep only the first two attributes (sepal length and sepal width)
X, y = X_data[:, :2], y_data

# Split the dataset into a training set and a test set (20 percent for testing)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.20, random_state=1)
print("\nTraining Dataset Shape:", X_train.shape, y_train.shape)

# Standardize the features using statistics from the training set only
scaler = StandardScaler().fit(X_train)
X_train = scaler.transform(X_train)
X_test = scaler.transform(X_test)

# Plot the training instances
# Set the figure size
plt.figure(figsize=(7.16, 3.50))
plt.subplots_adjust(bottom=0.05, top=0.9, left=0.05, right=0.95)
plt.title('Training instances', size=18)
colors = ['orange', 'green', 'cyan']
for i in range(len(colors)):
    px = X_train[:, 0][y_train == i]
    py = X_train[:, 1][y_train == i]
    plt.scatter(px, py, c=colors[i])

plt.legend(iris.target_names)
plt.xlabel('Sepal length')
plt.ylabel('Sepal width')
plt.show()

# Create the linear model: SGDClassifier
from sklearn.linear_model import SGDClassifier
linear_clf = SGDClassifier()

# Train the classifier using the fit() method
linear_clf.fit(X_train, y_train)

# Print the learned coefficients and intercepts (one row per class)
print("\nThe coefficients of the linear boundary are:", linear_clf.coef_)
print("\nThe intercepts of the linear boundary are:", linear_clf.intercept_)

# Evaluate the result on the training set
from sklearn import metrics
y_train_pred = linear_clf.predict(X_train)
print("\nThe training accuracy of our classifier is:", metrics.accuracy_score(y_train, y_train_pred)*100)

Output

It will produce output similar to the following. SGD shuffles the training data randomly and no random_state is set on the classifier, so the exact coefficients and accuracy vary from run to run.

Original Dataset Shape: (150, 4) (150,)

Training Dataset Shape: (120, 2) (120,)

The coefficients of the linear boundary are: [[-28.85486061  13.42772422]
 [  2.54806641  -5.04803702]
 [  7.03088805  -0.73391906]]

The intercepts of the linear boundary are: [-19.61738307  -3.54055412  -0.35387805]

The training accuracy of our classifier is: 76.66666666666667
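The coefficient matrix printed above has one row per class because SGDClassifier handles the three iris species with a one-versus-all scheme: each class gets its own linear scoring function, and the prediction is the class with the highest score. The sketch below checks that relationship by recomputing the scores by hand from coef_ and intercept_. For brevity it fits on the whole scaled dataset and sets random_state=0; both are assumptions made for this illustration and differ from the example above, so the learned numbers will not match the printed ones.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import SGDClassifier
from sklearn.preprocessing import StandardScaler

# Load the first two features and standardize them
X, y = load_iris(return_X_y=True)
X = StandardScaler().fit_transform(X[:, :2])

# random_state=0 is an assumption added for repeatability
clf = SGDClassifier(random_state=0).fit(X, y)

# One row of coefficients and one intercept per class
print(clf.coef_.shape, clf.intercept_.shape)   # (3, 2) (3,)

# Recompute the per-class scores and predictions by hand
manual_scores = X @ clf.coef_.T + clf.intercept_
manual_pred = clf.classes_[np.argmax(manual_scores, axis=1)]

# Both checks compare the manual result with scikit-learn's own methods
print(np.allclose(manual_scores, clf.decision_function(X)))
print(np.array_equal(manual_pred, clf.predict(X)))
```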

Statement:
This article is reproduced from tutorialspoint.com. If there is any infringement, please contact admin@php.cn to have it removed.