The median is one of the location parameters in probability theory and statistics. (The others are the mean and the mode.) For a real-valued random variable X with a cumulative distribution function F, the median of X is a number m that satisfies F(m−) ≤ ½ ≤ F(m), where F(m−) denotes the limit of F from the left at m; it is unique when F is continuous and strictly increasing around m. In other words, the median is the number that separates the upper half from the lower half of a population or a sample. If a random variable is continuous and has a probability density function, half of the area under the density curve lies to the left of m and the other half to the right of m. For this reason, the median is also called the 50th percentile (the pth percentile is the value such that p% of the observations are below it). In a box plot (also called a box-and-whisker plot), the median is the line inside the box, between the lower and upper hinges. The location of this line suggests the central tendency of the underlying data.
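
The defining condition can be checked directly on a small discrete distribution. The sketch below is illustrative only: it assumes the condition is read as P(X ≤ m) ≥ ½ and P(X ≥ m) ≥ ½, and the function name `is_median` and the fair-die example are hypothetical, not part of the original entry.

```python
from fractions import Fraction


def is_median(m, pmf):
    """Return True if P(X <= m) >= 1/2 and P(X >= m) >= 1/2.

    pmf is a dict mapping each possible value to its probability
    (hypothetical helper for illustration).
    """
    half = Fraction(1, 2)
    p_le = sum(p for x, p in pmf.items() if x <= m)  # F(m)
    p_ge = sum(p for x, p in pmf.items() if x >= m)  # 1 - F(m-)
    return p_le >= half and p_ge >= half


# Assumed example: a fair six-sided die.
die = {face: Fraction(1, 6) for face in range(1, 7)}

for candidate in (2, 3, 3.5, 4, 5):
    print(candidate, is_median(candidate, die))
```

For the die, every value from 3 to 4 meets the condition, which is why a discrete distribution may need a convention (such as the midpoint 3.5) to single out one median.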

The population median, like the population mean, is generally unknown. It must be inferred from the sample median, just as the population mean is inferred from the sample mean. When a sample can be fitted with a known probability model, the population median may be obtained directly from the model parameters. For instance, for a random variable that follows an exponential distribution with scale parameter β (a scale parameter stretches or shrinks a distribution), the median is β ln 2 (where ln denotes the natural logarithm, whose base is e ≈ 2.718281828). If it follows a normal distribution with location parameter μ and scale parameter σ, the median is μ. For a random variable following a Weibull distribution with location parameter μ, scale parameter α, and shape parameter γ (a shape parameter changes the shape of a distribution), the median is

$$m = \mu + \alpha (\ln 2)^{1/\gamma}.$$

However, not all distributions have a median in closed form. In such cases the population median cannot be obtained directly from the model parameters and has to be estimated from the sample median.
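
As a quick numerical check of the closed-form medians mentioned above, the following sketch evaluates each distribution's cumulative distribution function at its median and confirms the result is ½. The function names and parameter values are hypothetical, and the CDF parameterizations are assumptions: F(x) = 1 − exp(−x/β) for the exponential and F(x) = 1 − exp(−((x − μ)/α)^γ) for the three-parameter Weibull.

```python
import math
from statistics import NormalDist


def exponential_median(beta):
    """Median of an exponential distribution with scale beta."""
    return beta * math.log(2)


def weibull_median(mu, alpha, gamma):
    """Median of a Weibull distribution (location mu, scale alpha, shape gamma)."""
    return mu + alpha * math.log(2) ** (1 / gamma)


def exponential_cdf(x, beta):
    # Assumed parameterization: F(x) = 1 - exp(-x / beta), x >= 0.
    return 1 - math.exp(-x / beta)


def weibull_cdf(x, mu, alpha, gamma):
    # Assumed parameterization: F(x) = 1 - exp(-((x - mu) / alpha) ** gamma), x >= mu.
    return 1 - math.exp(-(((x - mu) / alpha) ** gamma))


# Illustrative parameter values (not from the source).
print(exponential_cdf(exponential_median(3.0), 3.0))
print(weibull_cdf(weibull_median(1.0, 2.0, 1.5), 1.0, 2.0, 1.5))
print(NormalDist(mu=10.0, sigma=2.0).cdf(10.0))
```

With μ = 0 and γ = 1 the Weibull median reduces to α ln 2, matching the exponential case.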

Definition and Calculation

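The sample median can be defined similarly, irrespective of the underlying probability distribution of a random variable. For a sample of n observations, x_1, x_2, …, x_n, taken from a random variable X, rank the observations in ascending order from smallest to largest and write x_(k) for the kth smallest value; the sample median, m, is defined as

$$m = \begin{cases} x_{((n+1)/2)}, & \text{if } n \text{ is odd}, \\ \tfrac{1}{2}\left(x_{(n/2)} + x_{(n/2+1)}\right), & \text{if } n \text{ is even}. \end{cases} \tag{1}$$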

That is, the sample median is the value of the middle observation in the ordered sample if the number of observations is odd, or the average of the two central observations if the number of observations is even. This is the most widely used definition of the sample median.
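
The odd/even rule just described can be sketched in a few lines of Python. The function name `sample_median` is a hypothetical choice for illustration; the standard library's `statistics.median` follows the same convention and is used here only as a cross-check.

```python
import statistics


def sample_median(observations):
    """Middle value of the sorted sample, or the mean of the two middle values."""
    ordered = sorted(observations)
    n = len(ordered)
    if n == 0:
        raise ValueError("sample_median requires at least one observation")
    mid = n // 2
    if n % 2 == 1:
        # Odd n: the ((n + 1) / 2)th smallest value (0-based index n // 2).
        return ordered[mid]
    # Even n: average of the (n / 2)th and (n / 2 + 1)th smallest values.
    return (ordered[mid - 1] + ordered[mid]) / 2


odd_sample = [7, 1, 5]
even_sample = [7, 1, 5, 3]
print(sample_median(odd_sample), statistics.median(odd_sample))
print(sample_median(even_sample), statistics.median(even_sample))
```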

According to Equation 1, the sample median is obtained from order statistics. No arithmetic summation over the sample is involved, in contrast to the sample mean. The sample median can therefore be used on data measured on ordinal, interval, or ratio scales, whereas the sample mean is best used on interval- and ratio-scale data because it first requires summing all values in the sample.
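
To illustrate the point about ordinal data, the sketch below finds the median of a set of ordered category labels using only their ranking, with no arithmetic on the values. The rating scale, the sample, and the function name `ordinal_median` are hypothetical. Because two labels cannot be averaged, the sketch assumes the lower of the two middle values is reported when the sample size is even; that is one possible convention, not something the definition above prescribes.

```python
# Hypothetical ordinal scale, listed from lowest to highest category.
ORDER = ["poor", "fair", "good", "excellent"]


def ordinal_median(labels, order=ORDER):
    """Median category of ordinal labels, using ranks only.

    For an even number of labels this returns the lower middle category
    (an assumed convention, since labels cannot be averaged).
    """
    ranks = sorted(order.index(label) for label in labels)
    if not ranks:
        raise ValueError("ordinal_median requires at least one label")
    return order[ranks[(len(ranks) - 1) // 2]]


ratings = ["good", "poor", "fair", "excellent", "good"]
print(ordinal_median(ratings))
```

A sample mean of these ratings would require assigning numeric scores to the categories, which an ordinal scale does not justify.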

...
