What is standard deviation in Python?


What is Standard Deviation?

Standard deviation is a number that describes how spread out the values are.

A low standard deviation means that most of the numbers are close to the mean (average) value.

A high standard deviation means that the values are spread out over a wider range.

Example: This time we have registered the speed of 7 cars:

speed = [86,87,88,86,87,85,86]

The standard deviation is:

0.9

Meaning that most of the values are within the range of 0.9 from the mean value, which is 86.4.

Let us do the same with a selection of numbers with a wider range:

speed = [32,111,138,28,59,77,97]

The standard deviation is:

37.85

Meaning that most of the values are within the range of 37.85 from the mean value, which is 77.4.

As you can see, a higher standard deviation indicates that the values are spread out over a wider range.
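
To make "most of the values" concrete, the sketch below counts how many readings fall within one standard deviation of the mean for both lists. It uses only the standard library, and the helper name share_within_one_std is made up for this illustration.

```python
import statistics

def share_within_one_std(values):
    """Return (count, total) of values within one standard deviation of the mean."""
    mean = statistics.mean(values)
    std = statistics.pstdev(values)  # population standard deviation
    inside = [v for v in values if abs(v - mean) <= std]
    return len(inside), len(values)

narrow = [86, 87, 88, 86, 87, 85, 86]
wide = [32, 111, 138, 28, 59, 77, 97]

print(share_within_one_std(narrow))  # (5, 7)
print(share_within_one_std(wide))    # (4, 7)
```

In both lists a majority of the values lie within one standard deviation of the mean, even though the size of that band differs greatly.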

The NumPy module has a function to calculate the standard deviation:

Example

Use the NumPy std() function to find the standard deviation:

import numpy

speed = [86,87,88,86,87,85,86]

x = numpy.std(speed)

print(x)


Example

import numpy

speed = [32,111,138,28,59,77,97]

x = numpy.std(speed)

print(x)




Variance

Variance is another number that indicates how spread out the values are.

In fact, if you take the square root of the variance, you get the standard deviation!

Or the other way around, if you multiply the standard deviation by itself, you get the variance!

To calculate the variance you have to do as follows:

1. Find the mean:

(32+111+138+28+59+77+97) / 7 = 77.4

2. For each value: find the difference from the mean:

 32 - 77.4 = -45.4
111 - 77.4 =  33.6
138 - 77.4 =  60.6
 28 - 77.4 = -49.4
 59 - 77.4 = -18.4
 77 - 77.4 = - 0.4
 97 - 77.4 =  19.6

3. For each difference: find the square value:

(-45.4)² = 2061.16
 (33.6)² = 1128.96
 (60.6)² = 3672.36
(-49.4)² = 2440.36
(-18.4)² =  338.56
(- 0.4)² =    0.16
 (19.6)² =  384.16

4. The variance is the average of these squared differences:

(2061.16+1128.96+3672.36+2440.36+338.56+0.16+384.16) / 7 = 1432.2
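
The four steps can be sketched in plain Python, without NumPy. The variable names are illustrative only.

```python
speed = [32, 111, 138, 28, 59, 77, 97]

# Step 1: find the mean.
mean = sum(speed) / len(speed)

# Step 2: find the difference of each value from the mean.
differences = [value - mean for value in speed]

# Step 3: square each difference.
squared = [d ** 2 for d in differences]

# Step 4: average the squared differences (population variance).
variance = sum(squared) / len(squared)

print(round(mean, 1))      # 77.4
print(round(variance, 1))  # 1432.2
```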

Luckily, NumPy has a function to calculate the variance:

Example

Use the NumPy var() function to find the variance:

import numpy

speed = [32,111,138,28,59,77,97]

x = numpy.var(speed)

print(x)



Standard Deviation

As we have learned, the formula to find the standard deviation is the square root of the variance:

√1432.25 = 37.85
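
The same relationship can be checked with the standard library alone. This sketch assumes the population forms statistics.pvariance() and statistics.pstdev(), which divide by the number of values just like the hand calculation.

```python
import math
import statistics

speed = [32, 111, 138, 28, 59, 77, 97]

variance = statistics.pvariance(speed)  # population variance
std_from_variance = math.sqrt(variance)

print(round(std_from_variance, 2))  # 37.85

# Squaring the standard deviation gives the variance back (up to rounding).
print(math.isclose(statistics.pstdev(speed) ** 2, variance))  # True
```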

Or, as in the example from before, use NumPy to calculate the standard deviation:

Example

Use the NumPy std() function to find the standard deviation:

import numpy

speed = [32,111,138,28,59,77,97]

x = numpy.std(speed)

print(x)



Symbols

Standard Deviation is often represented by the symbol Sigma: σ

Variance is often represented by the symbol Sigma Squared: σ²


Chapter Summary

The Standard Deviation and Variance are terms that are often used in Machine Learning, so it is important to understand how to get them, and the concept behind them.



The statistics module in Python provides a function, stdev(), that calculates the standard deviation. stdev() treats the data as a sample drawn from a larger population, rather than as the entire population.

To calculate the standard deviation of an entire population, use pstdev() instead.
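
A short sketch of the difference between the two functions on the same data follows. Note that numpy.std() uses the population form by default (its ddof parameter defaults to 0), so the NumPy results earlier correspond to pstdev() rather than stdev().

```python
import statistics

data = [1, 2, 3, 4, 5]

sample_sd = statistics.stdev(data)       # divides by n - 1
population_sd = statistics.pstdev(data)  # divides by n

print(sample_sd)
print(population_sd)
print(sample_sd > population_sd)  # True
```

The sample form is always the larger of the two for the same data, because it divides the same sum of squared deviations by a smaller number.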

Standard deviation is a measure of spread in statistics. It quantifies how much a set of data values varies. It is closely related to variance: standard deviation gives the typical deviation from the mean, whereas variance gives the average squared deviation.
A low standard deviation indicates that the data are clustered close to the mean, whereas a high standard deviation shows that the data are spread far from the mean. A useful property of the standard deviation is that, unlike the variance, it is expressed in the same units as the data.

The sample standard deviation is calculated by :

$$ s = \sqrt{\frac{1}{N-1} \sum_{i=1}^{N} (x_i - \bar{x})^2} $$

where $x_1, x_2, x_3, \ldots, x_N$ are the observed values in the sample data, $\bar{x}$ is the mean value of the observations and $N$ is the number of sample observations.
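
As a rough sketch, the sample formula can be written out by hand and compared against stdev(). The function name sample_std is hypothetical, and dividing by N - 1 is the usual sample convention that statistics.stdev() follows.

```python
import math
import statistics

def sample_std(values):
    """Sample standard deviation: divide the squared deviations by N - 1."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values")
    mean = sum(values) / n
    squared_deviations = sum((x - mean) ** 2 for x in values)
    return math.sqrt(squared_deviations / (n - 1))

sample = [1, 2, 3, 4, 5]
print(math.isclose(sample_std(sample), statistics.stdev(sample)))  # True
```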

Syntax : stdev(data, xbar=None)
Parameters : 
data : An iterable of real-valued numbers. 
xbar (Optional) : The mean of data. If omitted, the mean is calculated automatically.
Return type : Returns the sample standard deviation of the values passed as data.
Exceptions : 
StatisticsError is raised if data contains fewer than two values. 
If the value passed as xbar does not match the actual mean of data, the result may be impossible or imprecise. 
 

Code #1 :  

Python3

import statistics

sample = [1, 2, 3, 4, 5]

print("Standard Deviation of the sample is % s "

                % (statistics.stdev(sample)))

Output : 

Standard Deviation of the sample is 1.5811388300841898 

Code #2 : Demonstrate stdev() on a varying set of data types  

Python3

from statistics import stdev

sample1 = (1, 2, 5, 4, 8, 9, 12)

sample2 = (-2, -4, -3, -1, -5, -6)

sample3 = (-9, -1, -0, 2, 1, 3, 4, 19)

sample4 = (1.23, 1.45, 2.1, 2.2, 1.9)

print("The Standard Deviation of Sample1 is % s"

                              %(stdev(sample1)))

print("The Standard Deviation of Sample2 is % s"

                              %(stdev(sample2)))

print("The Standard Deviation of Sample3 is % s"

                              %(stdev(sample3)))

print("The Standard Deviation of Sample4 is % s"

                              %(stdev(sample4)))

Output : 

The Standard Deviation of Sample1 is 3.9761191895520196
The Standard Deviation of Sample2 is 1.8708286933869707
The Standard Deviation of Sample3 is 7.8182478855559445
The Standard Deviation of Sample4 is 0.41967844833872525

Code #3 : Demonstrate the difference between results of variance() and stdev()  

Python3

import statistics

sample = [1, 2, 3, 4, 5]

print("Standard Deviation of the sample is % s "

                    %(statistics.stdev(sample)))

print("Variance of the sample is % s"

     %(statistics.variance(sample)))

Output : 

Standard Deviation of the sample is 1.5811388300841898 
Variance of the sample is 2.5

Code #4 : Demonstrate the use of xbar parameter  

Python3

import statistics

sample = (1, 1.3, 1.2, 1.9, 2.5, 2.2)

m = statistics.mean(sample)

print("Standard Deviation of Sample set is % s"

         %(statistics.stdev(sample, xbar = m)))

Output : 

Standard Deviation of Sample set is 0.6047037842337906

Code #5 : Demonstrates StatisticsError  

Python3

import statistics

sample = [1]

print(statistics.stdev(sample))

Output : 

Traceback (most recent call last):
  File "/home/f921f9269b061f1cc4e5fc74abf6ce10.py", line 12, in <module>
    print(statistics.stdev(sample))
  File "/usr/lib/python3.5/statistics.py", line 617, in stdev
    var = variance(data, xbar)
  File "/usr/lib/python3.5/statistics.py", line 555, in variance
    raise StatisticsError('variance requires at least two data points')
statistics.StatisticsError: variance requires at least two data points
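
When the input may contain fewer than two values, the error can be caught rather than left to end the program. The wrapper name safe_stdev and the choice to return None are assumptions made for this sketch, not part of the statistics module.

```python
import statistics

def safe_stdev(values):
    """Return the sample standard deviation, or None when there are too few values."""
    try:
        return statistics.stdev(values)
    except statistics.StatisticsError:
        return None

print(safe_stdev([1]))        # None
print(safe_stdev([1, 2, 3]))  # 1.0
```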

Applications :  

  • Standard deviation is essential in statistics, where it is commonly used to measure confidence in statistical conclusions. For example, the margin of error in exam marks is determined by calculating the expected standard deviation in the results if the same exam were to be conducted multiple times.
  • It is also useful in finance, where it helps to gauge the likely range of profit and loss. For example, the standard deviation of the rate of return on an investment is a measure of the volatility of the investment.
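
To illustrate the volatility idea, the sketch below compares two sets of monthly returns. The numbers are invented for this example and do not describe any real investment.

```python
import statistics

# Hypothetical monthly returns in percent (made-up numbers).
steady = [1.0, 1.2, 0.8, 1.1, 0.9]
volatile = [6.0, -4.0, 9.0, -7.0, 1.0]

steady_sd = statistics.stdev(steady)
volatile_sd = statistics.stdev(volatile)

print(steady_sd)
print(volatile_sd)
print(steady_sd < volatile_sd)  # True
```

Both series average about 1 percent per month, yet the second swings far more widely, which is what its larger standard deviation captures.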

What does the standard deviation tell you?

A standard deviation (or σ) is a measure of how dispersed the data is in relation to the mean. Low standard deviation means data are clustered around the mean, and high standard deviation indicates data are more spread out.

Is there a standard deviation function in Python?

Yes. The statistics module in Python provides the stdev() function, which calculates the standard deviation of a sample of data rather than an entire population. For an entire population, use pstdev().

What is standard deviation with an example?

Consider the data set: 2, 1, 3, 2, 4. The mean is 2.4 and the sum of squared deviations of the observations from the mean is 5.2. Treating the data as an entire population, the standard deviation is √(5.2/5) ≈ 1.02.
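
This worked example can be checked with a few lines of Python. The sketch uses statistics.pstdev() because the calculation divides by the number of values (5), which is the population form.

```python
import statistics

data = [2, 1, 3, 2, 4]

mean = statistics.mean(data)
sum_of_squares = sum((x - mean) ** 2 for x in data)

print(mean)                               # 2.4
print(round(sum_of_squares, 2))           # 5.2
print(round(statistics.pstdev(data), 2))  # 1.02
```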