
The percentile is a concept often used to summarize data and to place a score or measurement taken on an individual into the context of a larger population. For any particular number p between 0 and 100, the pth percentile of a set of n measurements arranged in order of magnitude is the value that has at most p% of the observations below it and at most (100 − p)% above it. Roughly speaking, the first percentile is the number that divides the bottom 1% of the data from the top 99%; the second percentile is the number that divides the bottom 2% of the data from the top 98%; and so on. Therefore, if a man has a body mass index score at the 98th percentile for his age, roughly 98% of men his age have a body mass index score lower than his, and only 2% have a higher score.

Percentiles may be viewed as dividing a data set into 100 equal parts. Coarser groupings are often used; for instance, the median of a data set is also the 50th percentile, which specifies that at least half the observations are less than or equal to it. Other commonly used groupings include deciles, which divide a data set into tenths (10 equal parts); quintiles, which divide a data set into fifths (5 equal parts); and quartiles, which divide a data set into quarters (4 equal parts). Of these, quartiles are the most commonly used.
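As a rough illustration of how these groupings relate to percentiles, the sketch below lists the percentiles that serve as boundaries when a data set is split into a given number of equal parts. The function name `grouping_percentiles` is hypothetical and chosen only for this example.

```python
def grouping_percentiles(parts):
    """Percentiles that act as boundaries when data are split into `parts` equal parts.

    Hypothetical helper for illustration: splitting into k parts needs k - 1 boundaries.
    """
    if parts < 2:
        raise ValueError("parts must be at least 2")
    return [100 * i / parts for i in range(1, parts)]


quartiles = grouping_percentiles(4)   # boundaries at the 25th, 50th, and 75th percentiles
quintiles = grouping_percentiles(5)   # boundaries at the 20th, 40th, 60th, and 80th percentiles
deciles = grouping_percentiles(10)    # boundaries at the 10th, 20th, ..., 90th percentiles
```

Under this view, the median is simply the middle quartile boundary, that is, the 50th percentile.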

Percentiles are often used to describe large data sets; for instance, in the body mass index example above, the percentiles may have been calculated using a sample of thousands of American men. However, researchers sometimes want to calculate percentiles, quartiles, and so on, for a smaller data set, in which case the following procedure may be used to establish cut points.

  • Arrange the observations in increasing order, from smallest to largest.
  • Calculate the product np of the sample size n and the proportion p you wish to include in each division (for quartiles, p = 0.25; for deciles, p = 0.10; etc.).
  • If np is an integer, say k, calculate the average of the kth and (k + 1)th ordered values; if np is not an integer, round it up to the next integer and find the corresponding ordered value.
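The procedure above can be sketched in Python as follows. This is a minimal illustration rather than a definitive implementation: the function name `cut_point`, the small tolerance used to decide whether np is an integer, and the eight-value sample are all assumptions made for this example.

```python
import math


def cut_point(values, p):
    """Cut point for a proportion p (0 < p < 1), following the three steps above."""
    if not 0 < p < 1:
        raise ValueError("p must be strictly between 0 and 1")
    ordered = sorted(values)              # step 1: increasing order
    n = len(ordered)
    if n < 2:
        raise ValueError("need at least two observations")
    np_ = n * p                           # step 2: product of n and p
    k = round(np_)
    # Step 3: treat np as an integer if it is within a tiny tolerance of one
    # (assumption: guards against floating-point error in n * p).
    if 1 <= k < n and math.isclose(np_, k, abs_tol=1e-9):
        # Average of the kth and (k + 1)th ordered values (lists are 0-indexed).
        return (ordered[k - 1] + ordered[k]) / 2
    # Otherwise round np up and take the corresponding ordered value.
    return ordered[math.ceil(np_) - 1]


sample = [7, 3, 8, 1, 5, 2, 6, 4]         # hypothetical data, n = 8
first_quartile = cut_point(sample, 0.25)  # np = 2, so average of 2nd and 3rd values
thirty_pct = cut_point(sample, 0.30)      # np = 2.4, rounded up to the 3rd value
```

Note that statistical software often uses interpolation rules that differ from this one, so results for small samples may not match exactly.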

For example, a study of serum total cholesterol (mg/L) recorded the ordered levels shown in Table 1 for 20 adult patients (the data were adapted by the author from data presented in Ott and Longnecker, 2001, p. 83).

To determine the first quartile, we take p = 0.25 and calculate np = (20)(0.25) = 5. Because np is an integer, the first quartile is the average of the fifth and sixth observations: (152 + 167)/2 = 159.5 mg/L.

Table 1 Ordered Serum Total Cholesterol Levels (mg/L) for 20 Adult Patients
Ordered Observation    Cholesterol (mg/L)
1                      133
2                      137
3                      148
4                      149
5                      152
6                      167
7                      174
8                      179
9                      189
10                     192
11                     201
12                     209
13                     210
14                     211
15                     218
16                     238
17                     245
18                     248
19                     253
20                     257
Source: Adapted from data presented in Ott and Longnecker (2001, p. 83).
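As a check on the first-quartile calculation, the short sketch below applies the same rule to the 20 values in Table 1. The variable names are illustrative assumptions; only the data and the rule come from the text.

```python
# Ordered cholesterol levels (mg/L) from Table 1.
cholesterol = [133, 137, 148, 149, 152, 167, 174, 179, 189, 192,
               201, 209, 210, 211, 218, 238, 245, 248, 253, 257]

n = len(cholesterol)   # 20 patients
k = n // 4             # np = (20)(0.25) = 5, an integer

# np is an integer, so average the 5th and 6th ordered values (0-indexed: 4 and 5).
first_quartile = (cholesterol[k - 1] + cholesterol[k]) / 2

# Observations at or below the cut point.
in_first_quartile = [x for x in cholesterol if x <= first_quartile]

print(first_quartile)      # 159.5
print(in_first_quartile)   # [133, 137, 148, 149, 152]
```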


Therefore, data points falling at or below this cut point are in the first quartile of the data set. To calculate the cut point for the median, we take p = 0.5, and np = (20)(0.5) = 10, so the median is the average of the 10th and 11th

...
