
In many scientific research fields, statistical models are used to describe a system or a population, to interpret a phenomenon, or to investigate the relationship among various measurements. These statistical models often contain one or multiple components, called parameters, that are unknown and thus need to be estimated from the data (sometimes also called the sample). An estimator, which is essentially a function of the observable data, is biased if its expectation does not equal the parameter to be estimated.

To formalize this concept, suppose θ is the parameter of interest in a statistical model, and let θ̂ be its estimator based on an observed sample. Then θ̂ is a biased estimator if E(θ̂) ≠ θ, where E denotes the expectation operator. Similarly, one may say that θ̂ is an unbiased estimator if E(θ̂) = θ. Some examples follow.

Example 1

Suppose an investigator wants to know the average amount of credit card debt of undergraduate students at a certain university. Then the population would be all undergraduate students currently enrolled in this university, and the population mean of the amount of credit card debt of these undergraduate students, denoted by θ, is the parameter of interest. To estimate θ, a random sample is collected from the university, and the sample mean of the amount of credit card debt is calculated. Denote this sample mean by θ̂₁. Then E(θ̂₁) = θ; that is, θ̂₁ is an unbiased estimator. If the largest amount of credit card debt in the sample, call it θ̂₂, is used to estimate θ, then obviously θ̂₂ is biased. In other words, E(θ̂₂) ≠ θ.
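A short Monte Carlo simulation (not part of the original example; the debt distribution, sample size, and replication count below are arbitrary illustrative choices) makes the contrast between the two estimators concrete:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical population of credit card debts: exponential with mean 1000
theta = 1000.0   # true population mean, the parameter of interest
n = 30           # sample size per replication
reps = 100_000   # number of simulated samples

samples = rng.exponential(scale=theta, size=(reps, n))

# theta_hat_1: the sample mean of each sample -- unbiased for theta
mean_est = samples.mean(axis=1)
# theta_hat_2: the sample maximum of each sample -- biased upward
max_est = samples.max(axis=1)

print(round(mean_est.mean(), 1))  # settles near the true mean, 1000
print(round(max_est.mean(), 1))   # far above 1000
```

Averaging each estimator over many replications approximates its expectation: the sample means center on θ, while the sample maxima systematically overshoot it.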

Example 2

In this example a more abstract scenario is examined. Consider a statistical model in which a random variable X follows a normal distribution with mean μ and variance σ², and suppose a random sample X₁, …, Xₙ is observed. Let the parameter θ be μ. It is seen in Example 1 that X̄, the sample mean of X₁, …, Xₙ, is an unbiased estimator for θ. But X̄² is a biased estimator for μ² (or θ²). This is because X̄ follows a normal distribution with mean μ and variance σ²/n. Therefore,

E(X̄²) = Var(X̄) + {E(X̄)}² = σ²/n + μ² ≠ μ².
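A small simulation (a sketch using arbitrary values for μ, σ, and n) confirms the calculation: averaged over many samples, X̄² settles near μ² + σ²/n rather than μ².

```python
import numpy as np

rng = np.random.default_rng(1)

mu, sigma, n = 2.0, 3.0, 10   # arbitrary illustration values
reps = 200_000                 # number of simulated samples

# Draw many samples of size n and compute the sample mean of each
xbar = rng.normal(mu, sigma, size=(reps, n)).mean(axis=1)

# The average of xbar**2 approximates E(X-bar squared)
print(np.mean(xbar**2))       # near mu**2 + sigma**2/n = 4.9
print(mu**2 + sigma**2 / n)   # 4.9
print(mu**2)                  # 4.0, the target; the gap is the bias
```

The gap σ²/n shrinks as n grows, so X̄² is asymptotically unbiased for μ², but it is biased for every finite sample size.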

Example 2 indicates that one should be careful when determining whether an estimator is biased. Specifically, although θ̂ is an unbiased estimator for θ, g(θ̂) may be a biased estimator for g(θ) if g is a nonlinear function. In Example 2, g(θ) = θ² is such a function. However, when g is a linear function, that is, g(θ) = aθ + b where a and b are two constants, then g(θ̂) = aθ̂ + b is always an unbiased estimator for g(θ).

Example 3

Let X₁, …, Xₙ be an observed sample from some distribution (not necessarily normal) with mean μ and variance σ². The sample variance S², which is defined as

S² = Σ(Xᵢ − X̄)²/(n − 1),

where X̄ is the sample mean, is an unbiased estimator for σ², while the intuitive guess Σ(Xᵢ − X̄)²/n would yield a biased estimator. A heuristic argument is given here. If μ were known, Σ(Xᵢ − μ)²/n could be calculated, which would be an unbiased estimator for σ². But since μ is not known, it has to be replaced by X̄. This replacement actually makes the numerator smaller: Σ(Xᵢ − X̄)² ≤ Σ(Xᵢ − μ)² regardless of the value of μ. Therefore, the denominator has to be reduced a little bit (from n to n − 1) accordingly.
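NumPy exposes both divisors through the `ddof` argument of `var` (`ddof=0` divides by n, `ddof=1` by n − 1); a quick simulation with arbitrary parameters illustrates the bias of the intuitive version:

```python
import numpy as np

rng = np.random.default_rng(2)

mu, sigma, n = 0.0, 2.0, 5   # arbitrary values; true variance is 4
reps = 200_000               # number of simulated samples

x = rng.normal(mu, sigma, size=(reps, n))

# Divide by n (biased) versus n - 1 (the unbiased sample variance S^2)
biased = x.var(axis=1, ddof=0).mean()
unbiased = x.var(axis=1, ddof=1).mean()

print(biased)    # near (n - 1)/n * sigma**2 = 3.2
print(unbiased)  # near sigma**2 = 4.0
```

The divide-by-n version undershoots σ² by the factor (n − 1)/n, exactly the shrinkage of the numerator described in the heuristic argument above.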

...
