The sampling distribution of a statistic, S, gives the values of S and how often those values occur. The sampling distribution is a theoretical device that is the basis of statistical inference. One use is to determine if an observed S is rare or common.

Creating a Sampling Distribution

Recall that a statistic describes the sample and a parameter describes the population. First, one has the population. Then, one takes a random sample of size n from the population and finds the value of S. This value is the first value of the sampling distribution of the statistic. One then takes another random sample of size n from the population and finds the value of S. This value is the second value of the sampling distribution. The process is repeated, and the resulting collection of S values is the sampling distribution of S. Figure 1 gives a visual representation of the process of creating a sampling distribution. The S value varies from sample to sample.

Figure 1 Model of Creating a Sampling Distribution

The sampling distribution of S enables one to see that variability.
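
The repeated-sampling process described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the population (the integers 1 to 100), the sample size of 10, the number of repetitions, and the function name simulate_sampling_distribution are all hypothetical choices made for this sketch and do not come from the article. With a finite number of repetitions, the result approximates the sampling distribution rather than giving it exactly.

```python
import random
import statistics


def simulate_sampling_distribution(population, n, statistic, repetitions, seed=0):
    """Approximate the sampling distribution of `statistic` by repeated sampling."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    values = []
    for _ in range(repetitions):
        sample = rng.sample(population, n)  # random sample of size n, without replacement
        values.append(statistic(sample))    # one value of S per sample
    return values


# Hypothetical population, chosen only for illustration.
population = list(range(1, 101))
sample_means = simulate_sampling_distribution(population, 10, statistics.mean, 10_000)

# The statistic varies from sample to sample; its values scatter around the parameter.
print("population mean:", statistics.mean(population))
print("average of the simulated sample means:", statistics.mean(sample_means))
print("spread of the simulated sample means:", statistics.stdev(sample_means))
```

Passing statistics.median or statistics.stdev as the statistic argument gives the corresponding approximation for those statistics.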

The population size is either finite, infinite, or large enough to be considered 'infinite.' When the population is finite, that is, small, a proper random sample is taken from the population without replacement. The number of possible random samples of size n from a population of size N is therefore

$$\binom{N}{n} = \frac{N!}{n!\,(N-n)!}$$

Example 1: Finite Population

Example 1 (created by the author for this article) is a population of N = 6 items (presented in Table 1). Suppose we are interested in samples of size n = 3, so there are

$$\binom{6}{3} = \frac{6!}{3!\,3!} = 20$$

unique random samples from the data set. The population median is 6, the population mean is 5.8, and the population standard deviation is 2.4.
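
As a quick check of these figures, the following sketch counts the samples and recomputes the population parameters with Python's standard library. The variable names are illustrative. One assumption is worth flagging: the stated population standard deviation of 2.4 is reproduced by the divide-by-N form (statistics.pstdev), which is an observation from the arithmetic rather than something the article states.

```python
import math
import statistics
from itertools import combinations

population = [2, 4, 5, 7, 8, 9]  # the N = 6 items of Example 1
n = 3                            # sample size

# Number of distinct samples of size n drawn without replacement.
n_samples = math.comb(len(population), n)

# Enumerating the samples directly gives the same count.
samples = list(combinations(population, n))

pop_median = statistics.median(population)
pop_mean = statistics.mean(population)
pop_sd = statistics.pstdev(population)  # divides by N (population form)

print(n_samples, len(samples))  # 20 20
print(pop_median)               # 6.0
print(round(pop_mean, 1))       # 5.8
print(round(pop_sd, 1))         # 2.4
```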

Table 2 contains the 20 unique samples as well as the values of the following statistics: sample mean, sample median, and sample standard deviation. Figure 2 shows the sampling distributions of the sample mean, sample median, and sample standard deviation. Notice that the sampling distributions are centered on the associated parameter; that is, the sampling distribution of the median is centered on 6, and the same is true for the mean (centered at 5.8) and the standard deviation (centered at 2.4). Also, notice that there is variability in the sampling distributions. This variability is known as sampling variability: the variability that is induced by the act of taking a random sample. The value of a statistic typically does not equal the value of the parameter but varies around it.

Table 1 Example 1: Finite Population
2   4   5   7   8   9

Table 2 The 20 Possible Random Samples of Size 3 and the Values of Three Statistics
Sample   Median   Mean   St. Dev.      Sample   Median   Mean   St. Dev.
2 4 5    4        3.7    1.5           4 5 7    5        5.3    1.5
2 4 7    4        4.3    2.5           4 5 8    5        5.7    2.1
2 4 8    4        4.7    3.1           4 5 9    5        6.0    2.6
2 4 9    4        5.0    3.6           4 7 8    7        6.3    2.1
2 5 7    5        4.7    2.5           4 7 9    7        6.7    2.5
2 5 8    5        5.0    3.0           4 8 9    8        7.0    2.6
2 5 9    5        5.3    3.5           5 7 8    7        6.7    1.5
2 7 8    7        5.7    3.2           5 7 9    7        7.0    2.0
2 7 9    7        6.0    3.6           5 8 9    8        7.3    2.1
2 8 9    8        6.3    3.8           7 8 9    8        8.0    1.0
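
The entries of Table 2 can be regenerated by enumerating every sample of size 3, as in the sketch below. It assumes the tabulated sample standard deviations use the n - 1 divisor (statistics.stdev), which is the form that reproduces the printed values. The variable names are illustrative.

```python
import statistics
from itertools import combinations

population = [2, 4, 5, 7, 8, 9]  # Table 1

# One row per possible sample: (sample, median, mean, standard deviation).
rows = []
for sample in combinations(population, 3):
    rows.append((
        sample,
        statistics.median(sample),
        statistics.mean(sample),
        statistics.stdev(sample),  # n - 1 divisor (sample form)
    ))

for sample, med, mean, sd in rows:
    print(sample, med, round(mean, 1), round(sd, 1))

# Center (average) of each sampling distribution over all 20 samples.
center_median = statistics.mean([r[1] for r in rows])
center_mean = statistics.mean([r[2] for r in rows])
center_sd = statistics.mean([r[3] for r in rows])
print(center_median, round(center_mean, 1), round(center_sd, 1))
```

In this enumeration, the medians average exactly 6 and the means average 35/6, or about 5.8. The sample standard deviations average roughly 2.5, which is near the population value of 2.4 but not exactly equal to it.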

By looking at the sampling distribution, one can determine if an observed statistic is rare or common. For example, suppose one takes a random sample of three items and finds that the sample mean is 4 or below. If the random sample is from the original population, then the chance of observing a sample mean x̄ equal to 4 or less is 1/20 = 0.05, because only one of the 20 possible samples (2, 4, 5) has a mean that small. So a sample mean of 4 or less would occur only 5% of the time if the random sample is from the original population, which makes such an observation rare.
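
The 1/20 figure can be verified by counting, as in this short sketch. It assumes each of the 20 samples is equally likely, which is what simple random sampling without replacement implies. The names threshold and rare_samples are illustrative.

```python
from fractions import Fraction
from itertools import combinations

population = [2, 4, 5, 7, 8, 9]  # Table 1
samples = list(combinations(population, 3))

threshold = 4  # "sample mean is 4 or below"

# mean <= threshold is equivalent to sum <= threshold * n,
# which avoids floating-point comparisons.
rare_samples = [s for s in samples if sum(s) <= threshold * len(s)]

probability = Fraction(len(rare_samples), len(samples))
print(rare_samples)                     # [(2, 4, 5)]
print(probability, float(probability))  # 1/20 0.05
```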

...
