Variance, or dispersion, roughly refers to the degree of scatter or variability among a collection of observations. For example, in a survey regarding the effectiveness of a political leader, ratings from individuals will differ. In a survey dealing with reading ability among children, the expectation is that children will differ. Even in the physical sciences, measurements might differ from one occasion to the next because of the imprecision of the instruments used. In a very real sense, it is this variance that motivates interest in statistical techniques.

A basic issue that researchers face is deciding how variation should be measured when trying to characterize a population of individuals or things. That is, if all individuals of interest could be measured, how should the variation among these individuals be characterized? Such measures are population measures of variation. A related issue is deciding how to estimate a population measure of variation based on a sample of individuals.

Choosing a measure of dispersion is a complex issue that has seen many advances during the past 30 years, and more than 150 measures of dispersion have been proposed. The choice depends in part on the goal of the investigator, with the optimal choice often changing drastically depending on what an investigator wants to know or do. Although most of these measures seem to have little practical value, at least five or six play an important and useful role.

Certainly the best-known measure of dispersion is the population variance, which is typically written as σ². It is the average (or expected) value of the squared difference between an observation and the population mean. That is, if all individuals in a population could be measured (as in a complete census), the average of their responses is called the population mean, μ, and if for every observation the squared difference between it and μ were computed, the average of these squared values is σ². In more formal terms, σ² = E(X − μ)², where X is any observation the investigator might make and E stands for expected value. The (positive) square root of the variance, σ, is called the (population) standard deviation.
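As a minimal numerical sketch of these definitions, the Python snippet below computes μ, σ², and σ for a small, invented population of five values, treating it as a complete census; the values themselves are arbitrary and chosen only for illustration.

```python
# Minimal sketch: population mean, variance, and standard deviation for a
# complete census of a small, invented population (values are arbitrary).
population = [4.0, 7.0, 9.0, 12.0, 13.0]

n = len(population)
mu = sum(population) / n                               # population mean, mu
sigma_sq = sum((x - mu) ** 2 for x in population) / n  # sigma^2: average squared deviation from mu
sigma = sigma_sq ** 0.5                                # population standard deviation, sigma

print(mu, sigma_sq, sigma)
```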

Based on a simple random sample of n individuals, if the investigator observes the values X₁, …, Xₙ, the usual estimate of σ² is the sample variance

s² = Σ(Xᵢ − X̄)²/(n − 1),

where the sum is over i = 1, …, n and X̄ = (X₁ + ⋯ + Xₙ)/n is the sample mean.
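A corresponding sketch of this estimator, again with invented values, is given below; note the n − 1 divisor, which is also the divisor used by Python's standard-library statistics.variance.

```python
# Minimal sketch: the sample variance s^2 from a simple random sample
# (values invented for illustration); note the n - 1 divisor.
import statistics

sample = [4.2, 6.8, 9.1, 11.7, 13.4]

n = len(sample)
x_bar = sum(sample) / n                                 # sample mean, X-bar
s_sq = sum((x - x_bar) ** 2 for x in sample) / (n - 1)  # sample variance, s^2

print(x_bar, s_sq)
print(statistics.variance(sample))                      # same n - 1 divisor
```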

For some purposes, the appeal of the standard deviation stems from a fundamental result: under normality, the probability that an observation falls within any specified distance from the mean, measured in units of σ, is completely determined. For example, the probability that an observation is within one standard deviation of the mean is .68, and the probability that it is within two standard deviations is .954. These properties underlie a commonly used measure of effect size (a measure intended to characterize the extent to which two groups differ) as well as a frequently employed rule for detecting outliers (unusually large or small values). Shortly after a seminal paper by J. W. Tukey in 1960, however, it was realized that even very small departures from normality can alter these properties substantially, leading to practical problems that occur quite commonly in applied work.
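The quoted probabilities can be checked directly under the normality assumption; the sketch below uses scipy.stats.norm (SciPy is assumed to be available) and works with the standard normal distribution, since within-k-standard-deviation probabilities do not depend on μ or σ.

```python
# Check the within-one-sigma and within-two-sigma probabilities under normality.
from scipy.stats import norm

within_one = norm.cdf(1) - norm.cdf(-1)   # approximately 0.683
within_two = norm.cdf(2) - norm.cdf(-2)   # approximately 0.954

print(round(within_one, 3), round(within_two, 3))
```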

...
