
Variance

When describing a distribution of scores, one should use at least three indices: the shape of the distribution (e.g., unimodal, normal, or skewed), a measure of central tendency (e.g., the mean or the median), and a measure of the spread of scores. The variance is an example of the last of these. The importance of a measure of spread can be seen in the following example:

[Example: scores in two distributions, X and Y]

Both distributions have the same mean ($\bar{X} = \bar{Y} = 100$), but the scores in distribution X cluster closer to the mean than those in distribution Y.

Several measures can be used to describe the spread of scores. The range (highest score minus the lowest score) is simple and easy to understand but takes into account only the two outermost scores. One aberrant score can greatly affect the value of the range and give a false impression of how scores actually cluster together. The semi-interquartile range gets around this problem by considering only the central 50% of scores but ignores half the scores and is not a useful measure in inferential statistics. The most commonly used measures of spread of scores are the variance and the standard deviation. The standard deviation is merely the square root of the variance, and thus, it is the variance that is the important indicator.
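The contrast among these measures can be sketched in a few lines of Python. The scores below are hypothetical, invented only for illustration, and the helper names `score_range` and `semi_interquartile_range` are assumptions of this sketch. The quartiles come from the standard library's `statistics.quantiles`; other quartile conventions would give slightly different values.

```python
import statistics

def score_range(scores):
    """Range: highest score minus lowest score."""
    return max(scores) - min(scores)

def semi_interquartile_range(scores):
    """Half the distance between the first and third quartiles."""
    q1, _, q3 = statistics.quantiles(scores, n=4, method="inclusive")
    return (q3 - q1) / 2

# Hypothetical scores, invented for illustration only.
typical = [96, 98, 99, 100, 100, 101, 102, 104]
with_aberrant = typical[:-1] + [160]  # swap the top score for an aberrant one

for label, scores in (("typical", typical), ("with aberrant score", with_aberrant)):
    print(label)
    print("  range:", score_range(scores))
    print("  semi-interquartile range:", semi_interquartile_range(scores))
    print("  variance:", statistics.pvariance(scores))
```

In this sketch the single aberrant score changes the range and the variance but leaves the semi-interquartile range untouched, because the latter looks only at the central 50% of scores.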

The variance is commonly referred to as the average squared deviation from the mean. Its formula (using notation for a sample of scores, X) is

$$S^2 = \frac{\sum (X - \bar{X})^2}{n} = \frac{SS}{n}$$

where

Capital S squared ($S^2$) is the symbol for the variance;

$\bar{X}$ (“X bar”) is the mean of the scores;

$(X - \bar{X})$ indicates a deviation from the mean (how far away a score is from the mean);

The symbol ∑ (capital Greek letter sigma) is a direction “to sum” or “add”;

n is sample size; and

SS is the sum of the squared deviations from the mean (the numerator).
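As a minimal sketch, the definition translates directly into code. The function name `variance` and the scores are assumptions made for illustration; the division is by n, as in the definition above.

```python
import statistics

def variance(scores):
    """Average squared deviation from the mean (divides by n)."""
    n = len(scores)
    mean = sum(scores) / n
    ss = sum((x - mean) ** 2 for x in scores)  # SS: sum of squared deviations
    return ss / n

# Hypothetical scores, invented for illustration only.
scores = [98, 99, 100, 101, 102]
print(variance(scores))
print(statistics.pvariance(scores))  # standard-library counterpart, also divides by n
```

Python's `statistics.pvariance` uses the same divisor; `statistics.variance`, by contrast, divides by n − 1 and so returns a slightly larger value for the same scores.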

Notice several important aspects of the variance. The mean is the most commonly used measure of central tendency, and the variance is calculated by taking deviations from the mean. Thus, the variance shows how spread out scores are around the mean. Deviation scores are squared because the sum of the deviations from the mean, $\sum (X - \bar{X})$, always equals zero. An interesting feature of the variance is that the sum of the squared deviations from the mean, $\sum (X - \bar{X})^2$, is a smaller value than the sum of the squared deviations taken from any other score.
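Both properties follow from a short derivation, sketched here as a supplement. The symbol $c$ is introduced only for this sketch and stands for any reference value other than the mean.

```latex
\begin{align*}
\sum (X - \bar{X}) &= \sum X - n\bar{X} = n\bar{X} - n\bar{X} = 0 \\[4pt]
\sum (X - c)^2 &= \sum \bigl[(X - \bar{X}) + (\bar{X} - c)\bigr]^2 \\
  &= \sum (X - \bar{X})^2 + 2(\bar{X} - c)\sum (X - \bar{X}) + n(\bar{X} - c)^2 \\
  &= SS + n(\bar{X} - c)^2 \;\ge\; SS
\end{align*}
```

The middle term vanishes because the deviations from the mean sum to zero, so the sum of squares taken about $c$ exceeds $SS$ by $n(\bar{X} - c)^2$, which is zero only when $c = \bar{X}$.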

Note also that because the sum of the squared deviations from the mean is divided by n, the variance itself is a type of mean: the mean of squared deviation scores. Finally, like the mean of the scores, the variance takes every score into account. This is generally considered a desirable quality, but in very skewed distributions or distributions with a few very aberrant scores, one might wish to use another measure.

As an example, here is the calculation of the variance for distribution X. The mean is

$$\bar{X} = \frac{\sum X}{n} = 100$$

Next, take deviations from the mean, square them, and sum all of the squared deviations:

$$SS = \sum (X - \bar{X})^2$$

Then, divide by n to get the variance:

$$S^2 = \frac{SS}{n}$$

To return to original score units, calculate the standard deviation ($S_X$) by taking the square root of the variance:

$$S_X = \sqrt{S^2}$$
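The same sequence of steps can be traced in code. The scores here are hypothetical stand-ins chosen to have a mean of 100; they are not the entry's distribution X, and the variable names are assumptions of this sketch.

```python
import math

# Hypothetical scores with a mean of 100; not the entry's distribution X.
scores = [90, 95, 100, 105, 110]
n = len(scores)

mean = sum(scores) / n                    # step 1: the mean
deviations = [x - mean for x in scores]   # step 2: deviations from the mean
ss = sum(d ** 2 for d in deviations)      # step 3: sum of squared deviations (SS)
variance = ss / n                         # step 4: divide by n
std_dev = math.sqrt(variance)             # step 5: back to original score units

print("mean:", mean)
print("sum of deviations:", sum(deviations))
print("SS:", ss)
print("variance:", variance)
print("standard deviation:", std_dev)
```

Printing the sum of the unsquared deviations alongside the other values makes it easy to confirm that it is zero, which is why the deviations are squared before they are summed.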

...
