Proper scientific sampling is an important element in the study of populations. The population of interest may be a general population, such as all people in the United States 18 years of age or older, or a targeted subpopulation, such as people in the United States 18 years of age or older who live in poverty or in urban areas. Many epidemiologic studies gather information from such populations. The population of interest in these studies may be specific (physicians, people in homeless shelters, or people with a specific chronic disease, for example), or it may be any population that can be accurately defined. Drawing an appropriate sample of the target population is the foundation on which such studies must be built; an improper sample can invalidate everything the study sets out to discover. This entry describes different techniques that can be used for drawing a proper sample and when each is most appropriate. The Further Readings section provides a more in-depth look at the specific statistical properties of various sampling techniques, such as variance estimation and the approximation of required sample sizes.

Simple Random Samples

Any discussion of sampling techniques must begin with the concept of a simple random sample. Virtually all statistics texts define a simple random sample as a way of picking a sample of size n from a population of size N in a manner that guarantees that all possible samples of size n have an equal probability of being selected. From an operational standpoint, consider a list of the N population members. If one used a random number generator to assign a random number to each of the N members on the list and then selected the members holding the n smallest numbers, the result would be a valid simple random sample of size n. If this process were replicated a very large number of times, each possible sample of size n would be expected to be selected equally often, which meets the definition of a true simple random sample. Many statistical software packages contain a sampling module that applies this type of simple random sampling.
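The operational procedure described above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function name `simple_random_sample`, the `seed` parameter, and the ten-member sampling frame are all hypothetical and are not taken from any particular statistical package.

```python
import random


def simple_random_sample(population, n, seed=None):
    """Draw a simple random sample of size n by assigning random numbers.

    Hypothetical helper for illustration only.
    """
    if not 0 <= n <= len(population):
        raise ValueError("n must be between 0 and the population size N")
    rng = random.Random(seed)
    # Assign an independent uniform random number to each population member.
    keyed = [(rng.random(), index) for index in range(len(population))]
    # Sort by the random number and keep the members with the n smallest.
    keyed.sort()
    return [population[index] for _, index in keyed[:n]]


# Hypothetical sampling frame of N = 10 member IDs (assumed data).
frame = [f"member_{i}" for i in range(1, 11)]
sample = simple_random_sample(frame, n=3, seed=42)
print(sample)
```

Fixing `seed` only makes the draw reproducible; omitting it gives a fresh sample on each call. In Python, the standard library call `random.sample(frame, 3)` draws a sample of the same kind without the explicit sorting step.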

Virtually all statistical procedures described in textbooks or produced as output by statistical software assume that simple random sampling was performed. This includes the estimation of means, variances, standard errors, and confidence intervals. The same assumption underlies the estimation of standard errors around regression coefficients, correlation coefficients, odds ratios, and other statistical measures. In other words, the development of statistical estimation is built around the concept of simple random sampling. This does not imply that proper statistical estimates cannot be derived when simple random sampling was not performed, only that estimates derived by assuming simple random sampling may be incorrect if a different sampling technique was used.
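As a small illustration of where this assumption enters, the sketch below computes a sample mean, its standard error, and an approximate 95% confidence interval with the usual textbook formulas. Several things here are assumptions made for the example: the function name `srs_mean_ci` and the data values are hypothetical, the interval uses the normal approximation with z = 1.96, and no finite population correction is applied. The standard error formula s / sqrt(n) is the piece that presumes a simple random sample; under another design it may be incorrect.

```python
import math
import statistics


def srs_mean_ci(values, z=1.96):
    """Mean, standard error, and approximate CI, assuming simple random sampling.

    Hypothetical helper for illustration only.
    """
    n = len(values)
    if n < 2:
        raise ValueError("at least two observations are needed")
    mean = statistics.fmean(values)
    # Sample standard deviation (n - 1 denominator) divided by sqrt(n).
    # This standard error formula is valid under simple random sampling.
    se = statistics.stdev(values) / math.sqrt(n)
    # Normal-approximation interval; z = 1.96 is an assumed 95% level.
    return mean, se, (mean - z * se, mean + z * se)


# Hypothetical measurements from a simple random sample (assumed data).
values = [2, 4, 4, 4, 5, 5, 7, 9]
mean, se, (lower, upper) = srs_mean_ci(values)
print(f"mean={mean:.3f} se={se:.3f} ci=({lower:.3f}, {upper:.3f})")
```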

In practice, simple random sampling usually leads to the best statistical properties. It is usually found to produce the smallest variance estimates and the narrowest confidence intervals around estimates of interest. It also involves the least analytic complexity, because all statistical software programs can handle simple random sampling without difficulty. A simple random sample will not always produce the best statistical properties for estimation, but in practice this is usually the case.

...
