
Sampling Distributions

Sampling distributions are the basis for making statistical inferences about a population from a sample. A sampling distribution is built from a set of samples drawn from a population, with some statistic calculated from each sample. The distribution formed by the values of that statistic across the samples is the sampling distribution. If the statistic computed is the mean, for example, then the distribution of the means from each sample forms the sampling distribution of the mean. Sampling distributions provide a logical basis for using samples to make inferences about populations. They also provide a measure of variability among a set of sample means. This measure of variability, in turn, allows one to estimate the likelihood of observing a particular sample mean collected in an experiment to test a hypothesis.
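The construction described above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the population (10,000 normally distributed scores with mean 50 and standard deviation 10), the sample size of 25, the number of samples, and the helper name `sampling_distribution_of_mean` are all hypothetical choices made for illustration, not values from the entry.

```python
import random
import statistics


def sampling_distribution_of_mean(population, sample_size, num_samples, seed=0):
    """Draw num_samples random samples (with replacement) and return each sample's mean."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    return [
        statistics.fmean(rng.choices(population, k=sample_size))
        for _ in range(num_samples)
    ]


# Hypothetical population of scores (assumed values, for illustration only)
population_rng = random.Random(42)
population = [population_rng.gauss(50, 10) for _ in range(10_000)]

# Each entry is the mean of one sample; together they form the
# (simulated) sampling distribution of the mean.
sample_means = sampling_distribution_of_mean(
    population, sample_size=25, num_samples=2_000
)

print("population mean:     ", round(statistics.fmean(population), 2))
print("mean of sample means:", round(statistics.fmean(sample_means), 2))
print("population sd:       ", round(statistics.pstdev(population), 2))
print("sd of sample means:  ", round(statistics.pstdev(sample_means), 2))
```

In a run of this sketch, the sample means should cluster around the population mean and spread far less than the individual scores do. That spread is the measure of variability among sample means that the paragraph above refers to.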

At the simplest level, when testing a hypothesis, one is testing whether an obtained sample comes from a known population. If the sample value is likely for the known population, then the sample plausibly comes from that population. If the sample value is unlikely for the known population, then it likely does not come from the known population, and it can be inferred that it comes from a different, unknown population instead. If some treatment is performed, such as giving a drug to improve patient recovery time, then the average recovery time from a sample of the treated group allows a test of the idea that the treatment had some effect (here, on recovery time). Does giving patients this new drug in effect create a new and different population, a population using the drug? If the average recovery time of the treated group is very similar to, or likely for, the known population of patients who do not take the drug, then the treatment likely had no effect. If the average recovery time is very different from, or very unlikely for, the known population of patients not taking the drug, then the treatment likely had an effect and created a new population of patients with different recovery times. Thus, some way to judge how likely a value is for the known population is needed.

The common formula used to find the probability or the likelihood of a value for a known population (solving z-score problems) is

$$z = \frac{x - \mu}{\sigma}$$

In the previous formula, the standard deviation sigma (σ) describes how much variability exists in the population. Knowing how much variability exists in the population (the width of the distribution of scores) allows one to know how likely a single x value is for that population. Because most values in a distribution lie close to the mean, less likely values fall farther from the mean. The wider the distribution of scores, the less pronounced any specific difference between a value and the mean will be. For example, if the difference between an x value and the population mean (the numerator) remains constant, then that difference will be much more likely if the population has a very wide distribution (large denominator) than if it has a very narrow distribution (small denominator). So any factor that increases the relative difference between a value and the mean, such as a decreased spread in the distribution of scores, will lower the estimate of how likely the value is for the distribution.
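The effect of the spread on the likelihood of a value can be checked numerically. The sketch below assumes the population is normally distributed and uses made-up numbers (x = 60, a population mean of 50, and standard deviations of 20 and 5). The function names `z_score` and `two_tailed_probability` are hypothetical helpers, and the two-tailed probability is just one reasonable way to express "how likely" a value is.

```python
from statistics import NormalDist


def z_score(x, mu, sigma):
    """Distance of x from the population mean, in standard deviation units."""
    return (x - mu) / sigma


def two_tailed_probability(z):
    """Probability of a value at least |z| standard deviations from the mean,
    in either direction, assuming a normal distribution."""
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Assumed illustrative values: the same 10-point difference from the mean,
# evaluated against a wide (sigma = 20) and a narrow (sigma = 5) population.
x, mu = 60, 50
for sigma in (20, 5):
    z = z_score(x, mu, sigma)
    p = two_tailed_probability(z)
    print(f"sigma={sigma:>2}: z={z:.2f}, two-tailed p={p:.4f}")
```

The numerator is the same in both cases, but the narrow population yields the larger z and the smaller probability, which matches the reasoning in the paragraph above.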

...
