Unbiased Statistic

An unbiased statistic is a sample estimate of a population parameter whose sampling distribution has a mean that is equal to the parameter being estimated. Some traditional statistics are unbiased estimates of their corresponding parameters, and some are not. The simplest case of an unbiased statistic is the sample mean. Under the usual assumptions of population normality and simple random sampling, the sample mean is itself normally distributed with a mean equal to the population mean (and with a standard deviation equal to the population standard deviation divided by the square root of the sample size). A sample proportion is also an unbiased estimate of a population proportion. That is not surprising, as a proportion is a special kind of mean where all of the observations are 0s or 1s.
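A brief simulation can make this concrete. The following sketch (in Python with NumPy; the population mean, standard deviation, sample size, and number of replications are arbitrary illustrative choices) draws many simple random samples from a normal population and confirms that the sample means average out to the population mean, with a standard deviation close to the population standard deviation divided by the square root of the sample size.

    import numpy as np

    rng = np.random.default_rng(0)
    mu, sigma, n, reps = 50.0, 10.0, 25, 100_000   # illustrative values

    # Draw many simple random samples and record each sample mean.
    means = rng.normal(mu, sigma, size=(reps, n)).mean(axis=1)

    print(f"mean of sample means: {means.mean():.3f}  (parameter mu = {mu})")
    print(f"SD of sample means:   {means.std(ddof=1):.3f}  "
          f"(sigma/sqrt(n) = {sigma / np.sqrt(n):.3f})")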

The matter is more complicated with regard to the sample variance. If the sum of the squared differences of the sample observations from the sample mean is divided by the sample size n, the resulting statistic is not an unbiased estimate of the population variance. Because the deviations are measured from the sample mean, which is itself fitted to the data, they are on average somewhat smaller than deviations from the (unknown) population mean, so dividing by n systematically underestimates the population variance. To obtain an unbiased estimate, the researcher needs to divide the sum of squared deviations by one less than the sample size (n-1).
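A minimal sketch, again with arbitrary illustrative values for the population variance and sample size, shows the difference between the two divisors: averaged over many samples, the sum of squares divided by n falls short of the population variance, whereas dividing by n-1 does not.

    import numpy as np

    rng = np.random.default_rng(1)
    sigma2, n, reps = 4.0, 10, 200_000   # true population variance, sample size

    samples = rng.normal(0.0, np.sqrt(sigma2), size=(reps, n))
    devs = samples - samples.mean(axis=1, keepdims=True)
    ss = (devs ** 2).sum(axis=1)         # sum of squared deviations per sample

    print(f"mean of SS/n:     {(ss / n).mean():.3f}   (biased low)")
    print(f"mean of SS/(n-1): {(ss / (n - 1)).mean():.3f}   "
          f"(close to sigma^2 = {sigma2})")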

The situation is even more complicated for the sample standard deviation. Although the sample variance obtained by dividing the sum of squares by n-1 provides an unbiased estimate of the population variance, the square root of that statistic is not an unbiased estimator of the square root of the population variance (i.e., the population standard deviation), despite claims made in some statistics textbooks. (In mathematical statistical jargon, the expected value [mean] of the square root of a statistic is not, in general, equal to the square root of the expected value of the original statistic; because the square root is a concave function, the sample standard deviation tends to underestimate the population standard deviation.)
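The following sketch illustrates the point under the same arbitrary simulation assumptions: the sample variances average out to the population variance, but the sample standard deviations average out to somewhat less than the population standard deviation, with the shortfall largest for small samples.

    import numpy as np

    rng = np.random.default_rng(2)
    sigma, n, reps = 3.0, 5, 200_000     # small n makes the bias easy to see

    # Sample SDs computed with the unbiased-variance (n-1) divisor.
    s = rng.normal(0.0, sigma, size=(reps, n)).std(axis=1, ddof=1)

    print(f"mean of sample SDs:       {s.mean():.3f}   "
          f"(underestimates sigma = {sigma})")
    print(f"mean of sample variances: {(s ** 2).mean():.3f}   "
          f"(close to sigma^2 = {sigma ** 2})")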

For bivariate normal distributions, for which the Pearson product-moment correlation coefficient (r) is a measure of the direction and degree of linear relationship between the two variables, the sample r does not have a normal sampling distribution and is not an unbiased estimate of its population counterpart. The principal reason for this is that r is “boxed in” between −1 and +1. Fisher's z-transformation of r can be employed to partially remove the bias, and it is used frequently in testing hypotheses about population correlations and in establishing confidence intervals around sample correlations. The r is transformed to z (Fisher's z, not the standardized variable z), and the test is carried out in that metric. For the confidence interval, r is transformed to z, the interval is obtained for z, and the endpoints of the interval are transformed back to the r scale.
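The following sketch implements this back-and-forth transformation for an approximate 95% confidence interval. The correlation of .45, sample size of 50, and critical value of 1.96 are illustrative assumptions, and the standard error 1/sqrt(n-3) is the usual large-sample approximation for Fisher's z.

    import math

    def fisher_ci(r, n, z_crit=1.96):
        """Approximate 95% CI for a population correlation via Fisher's z."""
        z = math.atanh(r)                    # Fisher's z = 0.5 * ln((1+r)/(1-r))
        se = 1.0 / math.sqrt(n - 3)          # approximate standard error of z
        lo, hi = z - z_crit * se, z + z_crit * se
        return math.tanh(lo), math.tanh(hi)  # endpoints back to the r scale

    print(fisher_ci(r=0.45, n=50))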

It perhaps should go without saying, but an unbiased statistic computed from a given sample is not always equal to the corresponding parameter. Unbiasedness is strictly a “long run” concept. Thus, although not necessarily equal to the population parameter for any given sample, the expected value of the statistic across repeated samples is, in fact, the parameter itself. On any given occasion, a biased statistic might actually be closer to the parameter than an unbiased statistic would be. When estimating a population variance, for example, division of the sum of squares by n rather than n-1 might provide a “better” estimate; that statistic is the “maximum likelihood” estimator of a population variance. So a statistic should be chosen with attention to both its bias and its precision, that is, its sampling variability.
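A final sketch, under the same arbitrary simulation assumptions as before, estimates both the bias and the mean squared error of the two divisors. Under normality, dividing by n yields a negatively biased estimate that nonetheless has the smaller mean squared error, which is the tradeoff described above.

    import numpy as np

    rng = np.random.default_rng(3)
    sigma2, n, reps = 4.0, 10, 200_000

    samples = rng.normal(0.0, np.sqrt(sigma2), size=(reps, n))
    ss = ((samples - samples.mean(axis=1, keepdims=True)) ** 2).sum(axis=1)

    for divisor, label in [(n, "SS/n (ML, biased)"), (n - 1, "SS/(n-1) (unbiased)")]:
        est = ss / divisor
        mse = ((est - sigma2) ** 2).mean()   # mean squared error around sigma^2
        print(f"{label:22s} bias = {est.mean() - sigma2:+.3f}  MSE = {mse:.3f}")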

...
