Dispersion measures
Dispersion measures are numerical values, each computed with its own formula, that summarise how much a variable varies.
In other words, measures of dispersion are numbers that indicate whether a variable moves a lot or a little, and whether it moves more or less than another variable. Their purpose is to summarise one characteristic of the variable under study: its variability. For this reason, they should accompany measures of central tendency. Together, they provide information at a glance that we can then use to make comparisons and, if necessary, decisions.
Main measures of dispersion
The best-known measures of dispersion are the range, the variance, the standard deviation and the coefficient of variation (not to be confused with the coefficient of determination). These four measures are discussed below.
Range
The range is a numerical value that indicates the difference between the maximum and minimum value of a population or statistical sample. Its formula is:
$$R = \mathrm{Max}_x - \mathrm{Min}_x$$
Where:
- R → Is the range.
- Max → The maximum value of the sample or population.
- Min → The minimum value of the sample or statistical population.
- x → The variable for which this measure is intended to be calculated.
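As a minimal sketch of the range formula above, the following Python snippet subtracts the minimum from the maximum. The function name `value_range` and the `sales` figures are hypothetical and chosen only for illustration.

```python
def value_range(values):
    """Return the range of a data set: maximum value minus minimum value."""
    if not values:
        raise ValueError("the range is undefined for an empty data set")
    return max(values) - min(values)


# Hypothetical monthly sales figures, made up for illustration.
sales = [12, 15, 9, 20, 14]

print(value_range(sales))  # 20 - 9 = 11
```

Because the range uses only the two extreme observations, a single unusually large or small value changes it completely.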
Variance
Variance is a measure of dispersion that represents the variability of a data series with respect to its mean. Formally, it is calculated as the sum of the squared residuals (the differences between each observation and the mean) divided by the total number of observations. Its formula is as follows:
$$\mathrm{Var}(X) = \frac{\sum_{i=1}^{N} (x_i - \bar{x})^2}{N}$$
Where:
- X → Variable for which the variance is to be calculated.
- xi → Observation number i of variable X. i can take values between 1 and N.
- N → Number of observations.
- x̄ → The mean of the variable X.
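The definition above (sum of squared residuals divided by the number of observations) can be sketched in Python as follows. The helper `population_variance` and the `sales` data are hypothetical. The standard library's `statistics.pvariance` also divides by N, so it is used here only as a cross-check.

```python
import statistics


def population_variance(values):
    """Sum of squared deviations from the mean, divided by N."""
    n = len(values)
    if n == 0:
        raise ValueError("the variance is undefined for an empty data set")
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / n


# Hypothetical monthly sales figures, made up for illustration.
sales = [12, 15, 9, 20, 14]

# Mean is 14; squared deviations are 4, 1, 25, 36, 0 (sum 66); 66 / 5 = 13.2.
print(population_variance(sales))
print(statistics.pvariance(sales))  # library cross-check, same divisor N
```

Note that `statistics.variance` divides by N − 1 instead of N. That is the sample variance, a different estimator from the formula described in this section.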
Standard deviation
The standard deviation is another measure that provides information on the dispersion with respect to the mean. It is calculated in the same way as the variance, with one additional step: taking the square root of the result. In other words, the standard deviation is the square root of the variance. Its formula is:
$$\sigma_x = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \bar{x})^2}{N}}$$
Where:
- X → Variable for which the standard deviation is to be calculated.
- xi → Observation number i of variable X. i can take values between 1 and N.
- N → Number of observations.
- x̄ → The mean of the variable X.
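A minimal Python sketch of this relationship, taking the square root of the variance, might look like the following. The function name `population_std` and the data are hypothetical. `statistics.pstdev` is the standard library's population standard deviation and serves as a cross-check.

```python
import math
import statistics


def population_std(values):
    """Square root of the population variance (divisor N)."""
    n = len(values)
    if n == 0:
        raise ValueError("the standard deviation is undefined for an empty data set")
    mean = sum(values) / n
    variance = sum((x - mean) ** 2 for x in values) / n
    return math.sqrt(variance)


# Hypothetical monthly sales figures, made up for illustration.
sales = [12, 15, 9, 20, 14]

print(population_std(sales))     # square root of 13.2, roughly 3.63
print(statistics.pstdev(sales))  # library cross-check
```

Unlike the variance, which is expressed in squared units, the standard deviation is expressed in the same units as the data.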
Coefficient of variation
The coefficient of variation is obtained by dividing the standard deviation by the absolute value of the mean of the data set. It is usually expressed as a percentage to make it easier to interpret. Its formula is:
$$CV = \frac{\sigma_x}{|\bar{x}|}$$
Where:
- X → Variable for which the coefficient of variation is to be calculated.
- σx → Standard deviation of the variable X.
- | x̄ | → The mean of the variable X in absolute value, with x̄ ≠ 0.
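The calculation described above can be sketched in Python as shown below. The function name `coefficient_of_variation` and the data are hypothetical. The sketch assumes the population standard deviation (divisor N) and returns the result as a percentage.

```python
import statistics


def coefficient_of_variation(values):
    """Population standard deviation divided by |mean|, as a percentage."""
    mean = statistics.fmean(values)
    if mean == 0:
        # The formula requires a non-zero mean.
        raise ValueError("the coefficient of variation is undefined when the mean is 0")
    return statistics.pstdev(values) / abs(mean) * 100


# Hypothetical monthly sales figures, made up for illustration.
sales = [12, 15, 9, 20, 14]

# Standard deviation of about 3.63 over a mean of 14: roughly 26 %.
print(f"{coefficient_of_variation(sales):.1f} %")
```

Taking the absolute value of the mean keeps the coefficient positive even when the data are centred on a negative value.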
[Image: summary of the formulas for the range, the variance, the standard deviation and the coefficient of variation]
For comparative purposes, it is important to compare only variables with the same units of measurement. For example, it would not make much sense to say that the variability of gross domestic product (GDP) is greater than that of ice cream sales. The statement can technically be made, but comparing euros with a number of ice creams is not meaningful.
The same applies to the measures of dispersion themselves. If you want to compare two variables, it is preferable to use the same measure of dispersion for each of them, and preferably in the same unit.
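To illustrate why units matter, the sketch below expresses the same hypothetical prices in euros and in cents. The data are made up for illustration. The underlying variability is identical, yet the standard deviation and the variance change with the unit.

```python
import statistics

# Hypothetical ice cream prices, made up for illustration.
prices_eur = [1.5, 2.0, 2.5]
prices_cents = [p * 100 for p in prices_eur]  # same prices, different unit

std_eur = statistics.pstdev(prices_eur)
std_cents = statistics.pstdev(prices_cents)
var_eur = statistics.pvariance(prices_eur)
var_cents = statistics.pvariance(prices_cents)

# The standard deviation scales with the unit (factor of 100) ...
print(std_eur, std_cents)
# ... and the variance with its square (factor of 10,000).
print(var_eur, var_cents)
```

Comparing the figure in euros with the figure in cents would suggest very different variability, even though both describe exactly the same prices.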