Python program to calculate standard deviation

WBOY
2023-09-06 11:33:06

In this article, we will learn how to implement a Python program to calculate the standard deviation of a data set.

Consider a set of values plotted on an arbitrary axis. This set of values is called the population, and its standard deviation measures how much the values vary around their mean. If the standard deviation is low, the plotted values lie close to the mean; if it is high, they are spread further away from it.

The standard deviation is defined as the square root of the variance of the data set. There are two types of standard deviation -

The population standard deviation is calculated from every data value in the population, so it is a fixed value for that population. The mathematical formula is defined as follows -

$$\mathrm{SD\:=\:\sqrt{\frac{\sum(X_i\:-\:X_m)^2}{n}}}$$

Where,

  • Xm is the mean of the data set.

  • Xi are elements of the dataset.

  • n is the number of elements in the dataset.
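The population formula above can be translated almost term by term into Python. The following is a minimal sketch rather than a library implementation; the function name `population_sd` is a hypothetical helper chosen for illustration and does not come from the original article.

```python
import math


def population_sd(values):
    """Population standard deviation: sqrt(sum((X_i - X_m)^2) / n).

    `population_sd` is a hypothetical helper name used only for illustration.
    """
    n = len(values)
    if n == 0:
        raise ValueError("at least one value is required")
    mean = sum(values) / n  # X_m
    squared_deviations = sum((x - mean) ** 2 for x in values)  # sum of (X_i - X_m)^2
    return math.sqrt(squared_deviations / n)


print(population_sd([2, 3, 4, 1, 2, 5]))  # approximately 1.3437
```

The guard for an empty input is an added assumption; the formula itself is undefined when n is 0.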

The sample standard deviation, by contrast, is a statistic calculated from only some of the data values of a population, so its value depends on the sample chosen. The mathematical formula is defined as follows −

$$\mathrm{SD\:=\:\sqrt{\frac{\sum(X_i\:-\:X_m)^2}{n\:-\:1}}}$$

Where,

  • Xm is the mean of the data set.

  • Xi are elements of the dataset.

  • n is the number of elements in the dataset.
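The only difference from the population formula is the divisor, n - 1 instead of n. A minimal sketch of the sample formula follows; the function name `sample_sd` and the check for fewer than two values are illustrative assumptions, not part of the original article.

```python
import math


def sample_sd(values):
    """Sample standard deviation: sqrt(sum((X_i - X_m)^2) / (n - 1)).

    `sample_sd` is a hypothetical helper name used only for illustration.
    """
    n = len(values)
    if n < 2:
        # With a single value the divisor n - 1 would be zero.
        raise ValueError("at least two values are required")
    mean = sum(values) / n  # X_m
    squared_deviations = sum((x - mean) ** 2 for x in values)
    return math.sqrt(squared_deviations / (n - 1))


print(sample_sd([2, 3, 4, 1, 2, 5]))  # approximately 1.4720
```

Because the divisor is smaller, the sample value is always slightly larger than the population value for the same data, and the gap shrinks as n grows.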

Input and output scenarios

Now let’s look at some input and output scenarios for different data sets -

Assume that the data set contains only positive integers -

Input: [2, 3, 4, 1, 2, 5]
Result: Population Standard Deviation: 1.3437096247164249
Sample Standard Deviation: 1.4719601443879744

Assume that the data set contains only negative integers -

Input: [-2, -3, -4, -1, -2, -5]
Result: Population Standard Deviation: 1.3437096247164249
Sample Standard Deviation: 1.4719601443879744

Assume that the data set contains both positive and negative integers -

Input: [-2, -3, -4, 1, 2, 5]
Result: Population Standard Deviation: 3.131382371342656
Sample Standard Deviation: approximately 3.430258
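Scenarios like these can be checked quickly with the standard-library statistics module, which is covered later in this article. The sketch below simply loops over the three inputs; the variable names are illustrative.

```python
import statistics

# The three input scenarios listed above.
datasets = [
    [2, 3, 4, 1, 2, 5],
    [-2, -3, -4, -1, -2, -5],
    [-2, -3, -4, 1, 2, 5],
]

for data in datasets:
    population = statistics.pstdev(data)  # divides by n
    sample = statistics.stdev(data)       # divides by n - 1
    print(data, "population:", population, "sample:", sample)
```

The first two data sets give identical results: negating every value flips the sign of each deviation from the mean, but the squared deviations stay the same.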

Use mathematical formulas

We have already seen the formulas for standard deviation above; now let us implement them on a data set using a Python program.

Example

In the following example, we import the math library, compute the variance of a data set, and obtain the standard deviation by applying the math.sqrt() function to it.

import math

# declare the dataset list
dataset = [2, 3, 4, 1, 2, 5]

# find the mean of the dataset
mean = sum(dataset) / len(dataset)

# add up the squared deviations from the mean
deviation_sum = 0
for value in dataset:
    deviation_sum += (value - mean) ** 2

# calculating population standard deviation of the dataset (divide by n)
psd = math.sqrt(deviation_sum / len(dataset))

# calculating sample standard deviation of the dataset (divide by n - 1)
ssd = math.sqrt(deviation_sum / (len(dataset) - 1))

# display output
print("Population standard deviation of the dataset is", psd)
print("Sample standard deviation of the dataset is", ssd)

Output

The standard deviations obtained as output are as follows -

Population standard deviation of the dataset is 1.3437096247164249
Sample standard deviation of the dataset is 1.4719601443879744

Using the std() function in the numpy module

In this approach, we import the numpy module and calculate the population standard deviation of the elements of a numpy array using the numpy.std() function.

Example

The following Python program calculates the standard deviation of numpy array elements -

import numpy as np

#declare the dataset list
dataset = np.array([2, 3, 4, 1, 2, 5])

#calculating standard deviation of the dataset
sd = np.std(dataset)

#display output
print("Population standard deviation of the dataset is", sd)

Output

The standard deviation is displayed as the following output -

Population standard deviation of the dataset is 1.3437096247164249
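By default numpy.std() divides by n, which is why the result above is the population value. The function also accepts a ddof ("delta degrees of freedom") argument, and the divisor becomes n - ddof. A short sketch, using the same data set, of how the sample standard deviation could be obtained this way:

```python
import numpy as np

dataset = np.array([2, 3, 4, 1, 2, 5])

# ddof=0 is the default: the divisor is n (population standard deviation).
population_sd = np.std(dataset)

# ddof=1 changes the divisor to n - 1 (sample standard deviation).
sample_sd = np.std(dataset, ddof=1)

print("Population standard deviation of the dataset is", population_sd)
print("Sample standard deviation of the dataset is", sample_sd)
```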

Use stdev() and pstdev() functions in the statistics module

The statistics module in Python provides functions named stdev() and pstdev() to calculate the standard deviation of a data set. The stdev() function calculates only the sample standard deviation, while the pstdev() function calculates the population standard deviation.

Both functions take the data set as their first argument and return the standard deviation as a number. Each also accepts an optional precomputed mean as a second argument (named xbar for stdev() and mu for pstdev()).
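As a small illustration of how the two calls line up, the sketch below passes the same data set to both functions and also supplies the mean explicitly through the optional second argument (xbar for stdev(), mu for pstdev()). The variable names are illustrative, and supplying the mean is optional.

```python
import statistics as st

dataset = [2, 3, 4, 1, 2, 5]

# Compute the mean once and reuse it in both calls.
mean = st.mean(dataset)

sample_sd = st.stdev(dataset, xbar=mean)     # sample standard deviation
population_sd = st.pstdev(dataset, mu=mean)  # population standard deviation

print("Sample standard deviation of the dataset is", sample_sd)
print("Population standard deviation of the dataset is", population_sd)
```

If the mean is omitted, each function computes it internally, so passing it only saves work when it is already known.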

Example 1: Using stdev() function

The Python program that demonstrates the use of the stdev() function to calculate the sample standard deviation of a data set is as follows −

import statistics as st

#declare the dataset list
dataset = [2, 3, 4, 1, 2, 5]

#calculating standard deviation of the dataset
sd = st.stdev(dataset)

#display output
print("Standard Deviation of the dataset is", sd)

Output

The sample standard deviation of the data set obtained as output is as follows -

Standard Deviation of the dataset is 1.4719601443879744

Example 2: Using the pstdev() function

The Python program that demonstrates how to use the pstdev() function to find the population standard deviation of a data set is as follows -

import statistics as st

#declare the dataset list
dataset = [2, 3, 4, 1, 2, 5]

#calculating standard deviation of the dataset
sd = st.pstdev(dataset)

#display output
print("Standard Deviation of the dataset is", sd)

Output

The population standard deviation of the data set obtained as output is as follows -

Standard Deviation of the dataset is 1.3437096247164249
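The two results are related by a simple rescaling: since the formulas differ only in the divisor (n versus n - 1), the sample value equals the population value multiplied by sqrt(n / (n - 1)). The following sketch checks this numerically for the same data set; the variable names are illustrative.

```python
import math
import statistics as st

dataset = [2, 3, 4, 1, 2, 5]
n = len(dataset)

population_sd = st.pstdev(dataset)
sample_sd = st.stdev(dataset)

# Rescale the population value by sqrt(n / (n - 1)) to recover the sample value.
rescaled = population_sd * math.sqrt(n / (n - 1))

print(math.isclose(rescaled, sample_sd))  # True
```

For large n the factor approaches 1, so the two standard deviations become nearly indistinguishable.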

Statement:
This article is reproduced from tutorialspoint.com.