
Detailed explanation of classification evaluation indicators and regression evaluation indicators and Python code implementation

零到壹度 (Original) · 2018-04-16

This article gives a detailed explanation of classification and regression evaluation metrics, together with Python code implementations. It is shared here for readers who may find it useful as a reference.

1. Concept

Performance measurement (evaluation) metrics fall into two main categories:
1) Classification metrics, for discrete (integer-valued) predictions. They include accuracy, precision, recall, the F value, the P-R curve, the ROC curve and AUC.
2) Regression metrics, for real-valued predictions. They include the explained variance score (explained_variance_score), mean absolute error MAE (mean_absolute_error), mean squared error MSE (mean_squared_error), root mean squared error RMSE, cross-entropy loss (log loss), and the R-squared value (coefficient of determination, r2_score).

1.1. Premise

Assume there are only two classes, positive and negative. The class of interest is usually taken as the positive class and everything else as the negative class (so multi-class problems can also be reduced to the two-class case).
The confusion matrix is as follows:

Actual \ Predicted     Predicted positive    Predicted negative    Total
Actually positive      TP                    FN                    P (actually positive)
Actually negative      FP                    TN                    N (actually negative)

Each entry in the table follows an "AB" pattern: the first letter says whether the prediction is right or wrong (True/False), and the second gives the predicted class. For example, TP (True Positive) is a positive sample correctly predicted as positive; FN (False Negative) is a positive sample wrongly predicted as negative.
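As a quick illustration, here is a minimal sketch (using the same hypothetical binary labels as the code example further below, with 1 as the positive class) showing how the four cells can be read off with scikit-learn:

from sklearn.metrics import confusion_matrix

# Hypothetical toy labels: 1 = positive class, 0 = negative class
y_true = [0, 1, 1, 1]
y_pred = [0, 1, 0, 0]

# With labels=[0, 1], confusion_matrix returns rows for actual classes and
# columns for predicted classes; ravel() unpacks the cells as TN, FP, FN, TP.
tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()
print("TP:", tp, "FN:", fn, "FP:", fp, "TN:", tn)  # TP: 1 FN: 2 FP: 0 TN: 1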

2. Evaluation indicators (performance measurement)

2.1. Classification evaluation indicators

2.1.1 Value indicators: Accuracy, Precision, Recall, F value

Accuracy: the ratio of correctly classified samples to the total number of samples.
    accuracy = (TP + TN) / (TP + TN + FP + FN)

Precision: of all samples predicted to be positive, the proportion that are truly positive (e.g. of the messages flagged as spam, how many really are spam).
    precision = TP / (TP + FP)

Recall: of all samples that are truly positive, the proportion that are found (e.g. of all real spam messages, how many are flagged).
    recall = TP / (TP + FN)

F value: the weighted harmonic mean of precision and recall.
    F-score = (1 + beta^2) * precision * recall / (beta^2 * precision + recall)

Notes:
1. Precision is also often called the precision rate, and recall the recall rate.
2. The most commonly used F value is F1 (beta = 1):
    F1 = 2 * precision * recall / (precision + recall)

Python 3.6 code implementation:

# use the metric functions from the sklearn library
from sklearn import metrics
from sklearn.metrics import precision_recall_curve
from sklearn.metrics import average_precision_score
from sklearn.metrics import accuracy_score

# a toy classification result (1 = positive class)
y_pred = [0, 1, 0, 0]
y_true = [0, 1, 1, 1]

print("accuracy_score:", accuracy_score(y_true, y_pred))
print("precision_score:", metrics.precision_score(y_true, y_pred))
print("recall_score:", metrics.recall_score(y_true, y_pred))
print("f1_score:", metrics.f1_score(y_true, y_pred))
print("f0.5_score:", metrics.fbeta_score(y_true, y_pred, beta=0.5))
print("f2_score:", metrics.fbeta_score(y_true, y_pred, beta=2.0))

2.1.2 Related curves: P-R curve, ROC curve and AUC value
1) P-R curve

Steps:
1. Sort the samples by their "score" from high to low and use each score in turn as the threshold;
2. For each threshold, samples whose "score" is greater than or equal to the threshold are predicted positive and the rest negative, which yields one (precision, recall) pair per threshold.
e.g.

[Figure: the five test samples in the example with their prediction scores.]
With 0.9 as the threshold, the first test sample is predicted positive and samples 2, 3, 4 and 5 are predicted negative, which gives:

                                        Predicted positive                Predicted negative               Total
Positive example (score ≥ threshold)    0.9                               0.1                              1
Negative example (score < threshold)    0.2 + 0.3 + 0.3 + 0.35 = 1.15     0.8 + 0.7 + 0.7 + 0.65 = 2.85    4

The part below the threshold is treated as negative examples; the value entered for a predicted negative is the correctly-predicted value, i.e. TP for a positive sample and TN for a negative sample, all taken from the prediction scores. Precision and recall are then computed from this table in the usual way, precision = TP / (TP + FP) and recall = TP / (TP + FN).
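To make the two steps above concrete before the plotting pseudocode below, here is a minimal manual threshold sweep, assuming a small set of hypothetical scores and true labels (not the figures from the example above):

import numpy as np

# hypothetical prediction scores and ground-truth labels (1 = positive)
scores = np.array([0.9, 0.8, 0.7, 0.65, 0.3])
y_true = np.array([1, 0, 1, 1, 0])

# Step 1: sort the scores from high to low and use each one as a threshold
for threshold in sorted(scores, reverse=True):
    # Step 2: everything with score >= threshold is predicted positive
    y_pred = (scores >= threshold).astype(int)
    tp = np.sum((y_pred == 1) & (y_true == 1))
    fp = np.sum((y_pred == 1) & (y_true == 0))
    fn = np.sum((y_pred == 0) & (y_true == 1))
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    print(f"threshold={threshold:.2f}  precision={precision:.2f}  recall={recall:.2f}")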

Python pseudocode implementation:

# precision and recall are computed as above;
# this mainly introduces the Python plotting library
import matplotlib.pyplot as plt
# library for matrix operations
import numpy as np
# loading the iris data and training the model are covered in the previous post
...
# add 800 noise features to make the curves less trivial:
# column-concatenate the 150x800 noise matrix with the 150x4 iris feature matrix
X = np.c_[X, np.random.RandomState(0).randn(n_samples, 200 * n_features)]
# compute the precision and recall arrays for each class
precision = dict()
recall = dict()
for i in range(n_classes):
    # metrics for each of the three iris classes; _ holds the unused thresholds
    precision[i], recall[i], _ = precision_recall_curve(y_test[:, i], y_score[:, i])
# plot the curves
plt.clf()
for i in range(n_classes):
    plt.plot(recall[i], precision[i])
plt.xlim([0.0, 1.0])
plt.ylim([0.0, 1.05])
plt.xlabel("Recall")
plt.ylabel("Precision")
plt.show()

After completing the above code, the P-R curve of the iris data set is obtained
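Since the data loading and training are elided above, here is a self-contained sketch of one possible setup (assuming a one-vs-rest LinearSVC and a 50/50 split, similar to the scikit-learn documentation example; the original post's exact choices may differ):

import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import LinearSVC
from sklearn.preprocessing import label_binarize
from sklearn.metrics import precision_recall_curve

iris = load_iris()
X, y = iris.data, iris.target
n_samples, n_features = X.shape
n_classes = 3

# add 800 noise features (200 * n_features), as in the article
X = np.c_[X, np.random.RandomState(0).randn(n_samples, 200 * n_features)]

# binarize the labels for one-vs-rest evaluation
y = label_binarize(y, classes=[0, 1, 2])
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.5, random_state=0)

# fit a one-vs-rest classifier and get per-class decision scores
clf = OneVsRestClassifier(LinearSVC(random_state=0))
y_score = clf.fit(X_train, y_train).decision_function(X_test)

plt.clf()
for i in range(n_classes):
    precision, recall, _ = precision_recall_curve(y_test[:, i], y_score[:, i])
    plt.plot(recall, precision, label=f"class {i}")
plt.xlim([0.0, 1.0])
plt.ylim([0.0, 1.05])
plt.xlabel("Recall")
plt.ylabel("Precision")
plt.legend()
plt.show()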

2) ROC curve
Horizontal axis: false positive rate, FPR = FP / N (N is the number of actual negatives)
Vertical axis: true positive rate, TPR = TP / P (P is the number of actual positives)
Steps:
1. Sort the samples by their "score" from high to low and use each score in turn as the threshold;
2. For each threshold, samples whose "score" is greater than or equal to the threshold are predicted positive and the rest negative, which yields one (FPR, TPR) pair per threshold.

The calculation is analogous to that of the P-R curve, so it is not repeated here.

The ROC curves for the iris data set:

[Figure: ROC curves for the three iris classes.]

AUC (Area Under Curve) is defined as the area under the ROC curve.
The AUC value summarizes a classifier in a single number: the larger the AUC, the better the classifier, and its value lies in [0, 1].
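No code is given for the ROC curve in the section above; a minimal binary-classification sketch (assuming hypothetical labels and scores) using scikit-learn's roc_curve and roc_auc_score could look like this:

import matplotlib.pyplot as plt
from sklearn.metrics import roc_curve, roc_auc_score

# hypothetical ground-truth labels and predicted scores (1 = positive)
y_true = [0, 0, 1, 1, 0, 1, 1, 0]
y_score = [0.1, 0.4, 0.35, 0.8, 0.2, 0.7, 0.6, 0.5]

# roc_curve sweeps the thresholds and returns the FPR and TPR arrays
fpr, tpr, thresholds = roc_curve(y_true, y_score)
auc_value = roc_auc_score(y_true, y_score)
print("AUC:", auc_value)

plt.plot(fpr, tpr, label=f"AUC = {auc_value:.2f}")
plt.plot([0, 1], [0, 1], linestyle="--")  # the chance diagonal
plt.xlabel("False positive rate")
plt.ylabel("True positive rate")
plt.legend()
plt.show()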

2.2. Regression evaluation index

1) Explained variance score
    explained_variance = 1 - Var(y - ŷ) / Var(y)

2) Mean absolute error MAE (Mean absolute error)
    MAE = (1/n) * Σ |y_i - ŷ_i|

3) Mean squared error MSE (Mean squared error)
    MSE = (1/n) * Σ (y_i - ŷ_i)^2, and RMSE = sqrt(MSE)

4) Logistic regression loss (log loss, cross-entropy loss)
    log_loss = -(1/n) * Σ [ y_i * log(p_i) + (1 - y_i) * log(1 - p_i) ]

5) Consistency evaluation: Pearson correlation coefficient
    r = cov(X, Y) / (σ_X * σ_Y)
Python code implementation:

from sklearn.metrics import log_loss
log_loss(y_true, y_pred)

from scipy.stats import pearsonr
pearsonr(rater1, rater2)

from sklearn.metrics import cohen_kappa_score
cohen_kappa_score(rater1, rater2)
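The snippet above only shows the call signatures; a small self-contained sketch (with hypothetical regression targets and predictions) covering the metrics listed in this section might look like this:

import numpy as np
from sklearn.metrics import (explained_variance_score, mean_absolute_error,
                             mean_squared_error, r2_score, log_loss)
from scipy.stats import pearsonr

# hypothetical regression targets and predictions
y_true = [3.0, -0.5, 2.0, 7.0]
y_pred = [2.5, 0.0, 2.0, 8.0]

print("explained_variance:", explained_variance_score(y_true, y_pred))
print("MAE:", mean_absolute_error(y_true, y_pred))
print("MSE:", mean_squared_error(y_true, y_pred))
print("RMSE:", np.sqrt(mean_squared_error(y_true, y_pred)))
print("R2:", r2_score(y_true, y_pred))
print("Pearson r:", pearsonr(y_true, y_pred)[0])

# log loss expects class labels and predicted probabilities of the positive class
print("log_loss:", log_loss([0, 1, 1, 0], [0.1, 0.9, 0.8, 0.3]))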
