Calculation of simple statistics in Python

不言
2019-01-14 10:21:04

This article shows how to calculate simple statistics in Python using pandas and SciPy.

1. These steps assume that Anaconda is installed on the computer. If errors occur after installation, you can uninstall the existing Python and reinstall Anaconda. During installation it is recommended to check the option that adds Anaconda to the environment variables; otherwise you will have to add them yourself later. In PyCharm, select the python executable in the Anaconda installation folder as the interpreter. Then create a new data folder in the PyCharm project to store the data files.

2. Open the Python Console.

3. First read the data with Python. Enter import pandas as pd to import the pandas package, then enter df = pd.read_csv("./data/CityData.csv") to read the data, and finally enter df to display it.
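
The loading step can be sketched without the original file. The article does not show the contents of CityData.csv, so the CSV text below is a made-up stand-in that only borrows the column names cid, xid, and yid used later in the article.

```python
import io

import pandas as pd

# Hypothetical stand-in for ./data/CityData.csv (the real contents are not shown).
csv_text = """cid,xid,yid
1,10,3.5
2,20,4.0
3,20,9.5
4,30,2.0
"""

# read_csv accepts a file path or any file-like object, so StringIO works here.
df = pd.read_csv(io.StringIO(csv_text))

# In the console, entering df alone displays the table; in a script, print it.
print(df)
```

With the real file in place, passing "./data/CityData.csv" instead of the StringIO object should behave the same way.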

4. Enter type(df) and type(df["cid"]) in turn. The two types differ: df is a pandas DataFrame, while a single column such as df["cid"] is a pandas Series.
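
A minimal check of the two types, using a small made-up table in place of the article's data:

```python
import pandas as pd

# Made-up values; only the column name cid comes from the article.
df = pd.DataFrame({"cid": [1, 2, 3], "xid": [10, 20, 20]})

table_type = type(df)          # the whole table
column_type = type(df["cid"])  # one column selected by name

print(table_type)
print(column_type)
```

Most of the statistics methods in the following steps exist on both types: called on the DataFrame they return one value per column, and called on a Series they return a single value.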

5. Calculate the mean: enter df.mean() or df["xid"].mean()

6. Calculate the median: enter df.median() or df["yid"].median()
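
As a small illustration of steps 5 and 6, the sketch below computes the mean and median on made-up numbers. The yid column deliberately contains one large value to show how the mean and the median can differ.

```python
import pandas as pd

# Made-up values; 9.5 is an intentional outlier in yid.
df = pd.DataFrame({"xid": [10, 20, 20, 30], "yid": [3.5, 4.0, 9.5, 2.0]})

col_means = df.mean()            # Series with one mean per column
xid_mean = df["xid"].mean()      # single number for one column
yid_mean = df["yid"].mean()
yid_median = df["yid"].median()  # note the parentheses: median is a method

print(col_means)
print(xid_mean, yid_mean, yid_median)
```

Here the mean of yid is 4.75 while its median is 3.75, because the outlier pulls the mean upward but barely affects the median.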

7. Find the quartiles: enter df.quantile(q=0.25) for the first quartile; q=0.5 and q=0.75 give the median and the third quartile.
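
A short sketch of the quartile step on made-up data. Passing a list to q is a convenience that returns several quantiles at once; the interquartile range computed at the end is an addition for illustration, not something the article itself uses.

```python
import pandas as pd

# Made-up values chosen so the quartiles fall exactly on data points.
df = pd.DataFrame({"xid": [1, 2, 3, 4, 5]})

first_quartile = df["xid"].quantile(q=0.25)
quartiles = df["xid"].quantile([0.25, 0.5, 0.75])  # Series indexed by q

# Interquartile range: distance between the third and first quartiles.
iqr = quartiles.loc[0.75] - quartiles.loc[0.25]

print(first_quartile)
print(quartiles)
print(iqr)
```

By default pandas interpolates linearly between neighbouring values when a quantile falls between two data points.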

8. Find the mode: enter df.mode() or df["xid"].mode()
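
One detail of mode() worth seeing in action: it returns a Series rather than a single number, because several values can tie for most frequent. A sketch with made-up values:

```python
import pandas as pd

# Made-up values in which 20 and 30 both appear twice.
xid = pd.Series([10, 20, 20, 30, 30], name="xid")

modes = xid.mode()  # all values tied for the highest count, in sorted order

print(modes)
```

If only one value is wanted, something like modes.iloc[0] picks the smallest of the tied modes.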

9. Find the standard deviation: enter df.std() or df["yid"].std()

10. Calculate variance: df.var() or df["xid"].var()
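
For steps 9 and 10, note that pandas computes the sample standard deviation and variance by default (dividing by n - 1). The sketch below, on made-up numbers, compares that default with the population version obtained through the ddof parameter.

```python
import pandas as pd

# Made-up values with mean 5 and squared deviations summing to 32.
s = pd.Series([2, 4, 4, 4, 5, 5, 7, 9])

sample_var = s.var()            # default ddof=1: 32 / 7
sample_std = s.std()            # square root of the sample variance
population_var = s.var(ddof=0)  # 32 / 8
population_std = s.std(ddof=0)

print(sample_var, sample_std)
print(population_var, population_std)
```

NumPy's np.var and np.std default to ddof=0, so the same data can give slightly different numbers depending on which library is called.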

11. Sum: enter df.sum() or df["xid"].sum()

12. Calculate the skewness coefficient: enter df.skew() or df["yid"].skew()

13. Calculate the kurtosis coefficient: enter df.kurt() or df["yid"].kurt()
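
To give a feel for what the skewness and kurtosis coefficients in steps 12 and 13 report, here is a sketch on two made-up columns: one symmetric and one with a long right tail. The column names are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({
    "symmetric": [1, 2, 3, 4, 5],     # evenly spread around 3
    "right_tail": [1, 1, 1, 2, 10],   # one large value on the right
})

skewness = df.skew()  # near 0 for symmetric data, positive for a right tail
kurtosis = df.kurt()  # excess kurtosis: 0 corresponds to a normal distribution

print(skewness)
print(kurtosis)
```

A negative excess kurtosis, as for the evenly spread column, indicates tails that are lighter than those of a normal distribution.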

14. Work with the normal distribution. pandas does not provide distribution objects directly, so first import SciPy's statistics module with import scipy.stats as ss, and then enter ss.norm, which is a normal distribution object. Enter ss.norm.stats(moments="mvsk") to inspect it; m, v, s, and k stand for the mean, variance, skewness coefficient, and kurtosis coefficient respectively.

This returns four values, the mvsk of the standard normal distribution: 0, 1, 0, and 0 respectively. SciPy reports excess kurtosis, which is why the last value is 0 rather than 3.
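
The moments call from step 14 can be run as below. The second call, with loc=5 and scale=2, is an illustrative addition showing how the same function describes a non-standard normal distribution; those parameter values are arbitrary.

```python
import scipy.stats as ss

# Moments of the standard normal distribution (loc=0, scale=1 by default).
mean, var, skew, kurt = ss.norm.stats(moments="mvsk")
print(float(mean), float(var), float(skew), float(kurt))

# Arbitrary example: a normal distribution with mean 5 and standard deviation 2.
shifted_mean, shifted_var = ss.norm.stats(loc=5, scale=2, moments="mv")
print(float(shifted_mean), float(shifted_var))
```

In SciPy, loc is the mean and scale is the standard deviation of the normal distribution, so the variance in the second call is scale squared.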

15. ss.norm.pdf(0.0) returns the value of the density curve (the ordinate) at abscissa 0. ss.norm.ppf(0.9) returns the point x at which the cumulative probability from negative infinity to x equals 0.9; the argument to ppf must be between 0 and 1. ss.norm.cdf(2) returns the cumulative probability from negative infinity to 2, and ss.norm.rvs(size=10) draws 10 random numbers from the normal distribution.
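
The four calls in step 15 fit together as in the sketch below. The round trip through ppf and cdf and the random_state argument are additions for illustration: the first shows that the two functions are inverses, and the second only makes the random draw reproducible.

```python
import math

import scipy.stats as ss

density_at_zero = ss.norm.pdf(0.0)  # height of the bell curve at 0
x90 = ss.norm.ppf(0.9)              # point below which 90% of the probability lies
back_to_p = ss.norm.cdf(x90)        # cdf undoes ppf, so this is 0.9 again
prob_below_2 = ss.norm.cdf(2)       # cumulative probability up to 2

# random_state fixes the seed so repeated runs give the same 10 numbers.
samples = ss.norm.rvs(size=10, random_state=0)

print(density_at_zero, 1 / math.sqrt(2 * math.pi))
print(x90, back_to_p, prob_below_2)
print(samples)
```

The density at 0 equals 1 / sqrt(2 * pi), roughly 0.399, and the 90% point of the standard normal is roughly 1.28.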

16. Similarly, ss.chi2 and ss.t give the chi-square distribution and the t distribution respectively.
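
Unlike ss.norm, the chi-square and t distributions need a degrees-of-freedom argument before their methods return numbers. A sketch with arbitrarily chosen degrees of freedom (3 and 10 are assumptions for illustration only):

```python
import scipy.stats as ss

# Chi-square with 3 degrees of freedom: mean equals df, variance equals 2 * df.
chi2_mean, chi2_var = ss.chi2.stats(3, moments="mv")

# t distribution with 10 degrees of freedom: two-sided 95% critical value.
t_critical = ss.t.ppf(0.975, 10)

print(float(chi2_mean), float(chi2_var))
print(t_critical)
```

The same pdf, cdf, ppf, and rvs methods shown for ss.norm are available on these objects, with the degrees of freedom passed as an extra argument.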

17. In addition, we can sample from the data: enter df.sample(n=10) to draw 10 rows, or df.sample(frac=0.1) to draw a 10% sample.
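
A sketch of both sampling calls on a made-up table of 100 rows. The random_state argument is an addition that makes the draw repeatable; without it each call returns a different random selection.

```python
import pandas as pd

# Made-up table with 100 rows; only the column names follow the article.
df = pd.DataFrame({"cid": range(1, 101), "xid": range(100, 200)})

ten_rows = df.sample(n=10, random_state=0)        # exactly 10 rows
ten_percent = df.sample(frac=0.1, random_state=0)  # 10% of 100 rows, so also 10

print(ten_rows)
print(len(ten_percent))
```

Sampling is without replacement by default, so no row appears twice in either result.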

Statement:
This article is reproduced from segmentfault.com. If there is any infringement, please contact admin@php.cn to have it removed.