
The mean is a parameter that measures the central location of the distribution of a random variable and is an important statistic that is widely reported in scientific literature. Although the arithmetic mean is the statistic most commonly used to describe the central location of sample data, variations of it, such as the truncated mean, the interquartile mean, and the geometric mean, may be better suited to a given circumstance. The characteristics of the data dictate which of them should be used. Regardless of which mean is used, the sample mean remains a random variable: it varies with each sample taken from the same population. This entry discusses the use of the mean in probability and statistics, differentiates between the arithmetic mean and its variations, and examines how to determine which mean is appropriate for the data.
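As a rough illustration of how these variations can differ on the same data, the sketch below computes each of them for a small sample. The data values and the helper `truncated_mean` are hypothetical and chosen only for illustration. The helper uses one common convention (dropping a whole number of observations from each tail), and the interquartile mean is treated here as a 25% truncated mean.

```python
import statistics

def truncated_mean(values, proportion):
    """Mean after dropping int(proportion * n) observations from each tail."""
    ordered = sorted(values)
    k = int(len(ordered) * proportion)
    kept = ordered[k:len(ordered) - k]
    return statistics.fmean(kept)

# Hypothetical sample with one extreme observation
data = [1, 3, 4, 4, 5, 5, 7, 100]

arithmetic = statistics.fmean(data)            # pulled upward by the value 100
trimmed = truncated_mean(data, 0.125)          # drops one value from each end
interquartile = truncated_mean(data, 0.25)     # keeps only the middle half
geometric = statistics.geometric_mean(data)    # requires positive data

print(arithmetic, trimmed, interquartile, geometric)
```

With a single extreme value in the sample, the arithmetic mean lands far from the bulk of the observations, while the other three stay close to it.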

Use in Probability and Statistics

In probability, the mean is a parameter that measures the central location of the distribution of a random variable. For a real-valued random variable, the mean, or more appropriately the population mean, is the expected value of the random variable. That is to say, if one observes the random variable numerous times, the average of the observed values would converge in probability to the mean. For a discrete random variable with a probability function p(y), the expected value exists if

\[
\sum_{y} |y|\, p(y) < \infty, \quad \text{in which case} \quad E(Y) = \sum_{y} y\, p(y), \tag{1}
\]

where y ranges over the values assigned by the random variable. For a continuous random variable with a probability density function f(y), the expected value exists if

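\[
\int_{-\infty}^{\infty} |y|\, f(y)\, dy < \infty, \quad \text{in which case} \quad E(Y) = \int_{-\infty}^{\infty} y\, f(y)\, dy. \tag{2}
\]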

Comparing Equation 1 with Equation 2, one notices immediately that the f(y)dy in Equation 2 mirrors the p(y) in Equation 1, and the integration in Equation 2 is analogous to the summation in Equation 1.
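The parallel between summation and integration can be made concrete with a short numerical sketch. The two examples, a fair six-sided die and an exponential density with mean β = 2, are assumptions chosen for illustration, as are the function names. The integral is approximated with a midpoint rule on a truncated interval, so the continuous result is approximate rather than exact.

```python
import math

def discrete_mean(pmf):
    """Sum of y * p(y) over a {value: probability} mapping."""
    return sum(y * p for y, p in pmf.items())

def continuous_mean(pdf, lower, upper, steps=200_000):
    """Midpoint-rule approximation to the integral of y * f(y) on [lower, upper]."""
    width = (upper - lower) / steps
    total = 0.0
    for i in range(steps):
        y = lower + (i + 0.5) * width
        total += y * pdf(y) * width   # y * f(y) dy plays the role of y * p(y)
    return total

# Discrete case: fair die with p(y) = 1/6 for y = 1, ..., 6
die_pmf = {y: 1 / 6 for y in range(1, 7)}
die_mean = discrete_mean(die_pmf)

# Continuous case: exponential density with mean beta (assumed value)
beta = 2.0

def exp_pdf(y):
    return math.exp(-y / beta) / beta

# The upper limit 100 stands in for infinity; the omitted tail is negligible here
exp_mean = continuous_mean(exp_pdf, 0.0, 100.0)

print(die_mean, exp_mean)
```

The die example should come out at 3.5 up to floating-point rounding, and the numerical integral should land very close to β.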

The definitions above help in understanding the expected value, or population mean, conceptually. However, they are seldom used in research to derive the population mean. In most circumstances, either the size of the population (discrete random variables) or the true probability density function (continuous random variables) is unknown, or the population is so large that observing all of it is impractical. The population mean is thus an unknown quantity.

In statistics, a sample is often taken to estimate the population mean. Quantities derived from sample data are thus called statistics (in contrast to the corresponding population quantities, which are called parameters). If the form of the distribution of a random variable is known, a probability model may be fitted to the sample data. The population mean is then estimated from the fitted model parameters. For instance, if a sample can be fitted with a normal probability distribution model with parameters μ and σ, the population mean is simply estimated by the parameter μ (with σ² as the variance). If the sample can be fitted with a Gamma distribution with parameters α and β, the population mean is estimated by the product of α and β (i.e., αβ), with αβ² as the variance. For an exponential random variable with parameter β, the population mean is simply β, with β² as the variance. For a chi-square (χ²) random variable with v degrees of freedom, the population mean is v, with 2v being the variance.
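The Gamma case can be checked by simulation. The sketch below draws a large sample and compares its mean and variance with αβ and αβ². The parameter values, sample size, and seed are assumptions made for illustration. Python's `random.gammavariate(alpha, beta)` treats `beta` as a scale parameter, which matches the parameterization in which the mean is αβ.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

alpha, beta = 3.0, 2.0   # assumed Gamma parameters (beta is a scale parameter)
n = 100_000              # assumed sample size

sample = [random.gammavariate(alpha, beta) for _ in range(n)]

# Statistics computed from the sample
sample_mean = statistics.fmean(sample)
sample_var = statistics.variance(sample)

# Population quantities implied by the model parameters
model_mean = alpha * beta        # alpha * beta
model_var = alpha * beta ** 2    # alpha * beta^2

print(sample_mean, model_mean)
print(sample_var, model_var)
```

Because the sample mean is itself a random variable, a different seed gives a slightly different value, but for a sample of this size it should sit close to αβ. The exponential and chi-square cases follow from the same formulas, since an exponential with mean β is a Gamma with α = 1, and a chi-square with v degrees of freedom is a Gamma with α = v/2 and β = 2.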

...
