The range, variance, and standard deviation are measures of dispersion in descriptive statistics. They quantify how scattered or concentrated the values of a data set are around a central value (such as the mean). Here is a detailed explanation of each:
- Range: max - min
- Definition : The range is the difference between the maximum and minimum values of a data set. It gives a simple idea of the total spread of the data.
- Formula : $\text{Range} = \text{maximum value} - \text{minimum value}$
- Example : For the data 5, 8, 12, 15, 20: Range = 20 - 5 = 15
- Utility :
- Fast and intuitive dispersion measurement.
- Useful for comparing sets (e.g. maximum and minimum temperatures over a day).
- Limits :
- Only considers the two extreme values, ignoring the distribution of other data.
- Very sensitive to outliers (extreme values). Example: if we add 100 to the data above, the range jumps from 15 to 95 (100 - 5), which does not reflect where most of the data lie.
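As a minimal sketch of the calculation above, the range can be computed in Python with the built-in `max` and `min`. The helper name `data_range` is an assumption chosen for illustration. The second call shows how a single outlier changes the result.

```python
def data_range(values):
    """Return the range (maximum minus minimum) of a non-empty sequence."""
    return max(values) - min(values)


data = [5, 8, 12, 15, 20]
print(data_range(data))          # 15

# Adding a single extreme value stretches the range,
# even though the other values are unchanged.
print(data_range(data + [100]))  # 95
```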
- Variance
- Definition : Variance measures the spread of data around its mean, computed as the average of the squared deviations between each value and the mean. It indicates how far, in squared units, the data lie on average from their center.
- Formula :
- For a population (all data) : $\sigma^2 = \frac{\sum (x_i - \mu)^2}{N}$
- where $x_i$ is each value, $\mu$ is the population mean, and $N$ is the total number of values.
- For a sample (subset) : $s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1}$
- where $x_i$ is each value, $\bar{x}$ is the sample mean, and $n$ is the number of values.
- Example : Data : 2, 4, 6 (mean = 4):
- Deviations : (2-4) = -2, (4-4) = 0, (6-4) = 2.
- Squared deviations: 4, 0, 4.
- Variance (population): (4 + 0 + 4) / 3 = 8 / 3 ≈ 2.67.
- Variance (sample): (4 + 0 + 4) / (3-1) = 8 / 2 = 4.
- Utility :
- Provides a comprehensive measure of dispersion, considering all values.
- Basis for more advanced statistical analyses (e.g. hypothesis testing).
- Limits :
- Expressed in squared units (e.g. m² if the data are in meters), which makes it less intuitive.
- Sensitive to outliers, because squaring amplifies large deviations.
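The worked example with the data 2, 4, 6 can be reproduced step by step in Python. This is an illustrative sketch: the variable names are assumptions, and the final two lines use `statistics.pvariance` and `statistics.variance` from the standard library as a cross-check.

```python
import statistics

data = [2, 4, 6]
mean = statistics.mean(data)  # 4

# Squared deviations from the mean: 4, 0, 4
squared_deviations = [(x - mean) ** 2 for x in data]

# Population variance: divide by N
population_variance = sum(squared_deviations) / len(data)

# Sample variance: divide by n - 1
sample_variance = sum(squared_deviations) / (len(data) - 1)

print(round(population_variance, 2))  # 2.67
print(sample_variance)                # 4.0

# The standard library should agree with the manual calculation.
print(statistics.pvariance(data))
print(statistics.variance(data))
```

The only difference between the two results is the divisor: N for a population, n - 1 for a sample.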
- Standard deviation
- Definition : The standard deviation is the square root of the variance. It measures the typical dispersion of the data around the mean, in the same units as the original data.
- Formula :
- For a population : $\sigma = \sqrt{\sigma^2}$.
- For a sample : $s = \sqrt{s^2}$.
- Example : With the previous variances :
- Population : $\sigma = \sqrt{2.67} \approx 1.63$.
- Sample : $s = \sqrt{4} = 2$.
- Utility :
- More intuitive than variance, because it is in the same units as the data (e.g. meters, euros).
- Gives a sense of the "typical distance" of values from the mean.
- Widely used with normal distributions (e.g. about 68% of the data lie within 1 standard deviation of the mean).
- Limits :
- Like variance, it is sensitive to extreme values.
- Less informative on its own for strongly skewed distributions.
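Continuing with the data 2, 4, 6, the following sketch takes the square root of each variance and then compares the result with `statistics.pstdev` and `statistics.stdev` from the Python standard library. The variable names are assumptions chosen for illustration.

```python
import math
import statistics

data = [2, 4, 6]

# Standard deviation is the square root of the variance.
population_sd = math.sqrt(statistics.pvariance(data))
sample_sd = math.sqrt(statistics.variance(data))

print(round(population_sd, 2))  # 1.63
print(round(sample_sd, 2))      # 2.0

# statistics.pstdev and statistics.stdev compute these directly.
print(round(statistics.pstdev(data), 2))  # 1.63
print(round(statistics.stdev(data), 2))   # 2.0
```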
Comparison
| Measure | What it shows | Advantage | Disadvantage |
|---|---|---|---|
| Range | Total amplitude | Simple and fast | Ignores the distribution |
| Variance | Mean squared dispersion | Accurate, uses all data | Unintuitive units |
| Standard deviation | Mean dispersion (original units) | Intuitive, comparable | Sensitive to outliers |
Practical example
Data : 10, 12, 15, 20, 50 (mean = 21.4) :
- Range : 50 - 10 = 40.
- Variance (sample) :
Deviations : -11.4, -9.4, -6.4, -1.4, 28.6 → squared : 129.96, 88.36, 40.96, 1.96, 817.96 → sum = 1079.2 → $s^2 = 1079.2 / 4 = 269.8$
- Standard deviation (sample) : $s = \sqrt{269.8} \approx 16.43$.
Interpretation:
- The range (40) shows a large amplitude, but says nothing about how the values are distributed.
- The standard deviation (16.43) indicates that the data typically deviate by about 16.43 units from the mean, a figure strongly influenced by the value 50.
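The practical example can be checked with a short Python sketch using the standard `statistics` module. The variable names are assumptions, and the last step (dropping the value 50) is an added illustration of how much that one value drives the result.

```python
import statistics

data = [10, 12, 15, 20, 50]

mean = statistics.mean(data)
value_range = max(data) - min(data)
sample_variance = statistics.variance(data)  # divides by n - 1
sample_sd = statistics.stdev(data)

print(mean)                       # 21.4
print(value_range)                # 40
print(round(sample_variance, 2))  # 269.8
print(round(sample_sd, 2))        # 16.43

# Illustration only: the same measure without the extreme value 50.
sd_without_50 = statistics.stdev([10, 12, 15, 20])
print(round(sd_without_50, 2))    # 4.35
```

Rounded to two decimals, the square root of 269.8 is 16.43 (the unrounded value is about 16.4256). Without the value 50, the sample standard deviation drops to roughly 4.35.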
In summary
- Range : A crude, quick, but limited measure.
- Variance : A detailed view of the scatter, but less convenient to interpret on its own.
- Standard deviation : The most common and intuitive measure for assessing variability.
These three tools complement each other to give a complete picture of the spread of the data.