
Superpopulation

When data for a variable are gathered from a finite population and that variable is regarded as a random variable, the finite population is said to be “a realization from a superpopulation.” A superpopulation is the infinite population that elementary statistics textbooks often describe as part of the enumeration of a finite population. Because sampling theory is based on making inference for a well-defined finite population, the concept of a superpopulation is needed to differentiate between a finite population and an infinite superpopulation.

This distinction is important for two reasons: (1) sampling-theory estimation and inference can be based entirely on a finite population (in the absence of nonsampling errors), with no recourse to a superpopulation; and (2) even when a superpopulation is of primary interest (for example, when estimating the parameters of the superpopulation model), the finite population may have been sampled in a way that distorts the original distribution of the finite population.

The superpopulation and finite population concepts are compatible if one views the finite population labels (which are needed to allow specific units to be sampled) as part of the superpopulation model. On this view, the final sample can be thought of as the result of a two-step process. First, the finite population is selected from a superpopulation according to the superpopulation model. Then, after each unit is identified with a label and related information, a sample design is formed and the final sample of units is selected. The measured characteristics are known only for the final sample. However, other information, such as the information used as part of the sample design, will be known for the entire finite population.
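The two-step process can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the normal superpopulation model, the auxiliary variable `x`, the population and sample sizes, and the function names `draw_finite_population` and `draw_sample` are all hypothetical choices, and simple random sampling without replacement stands in for whatever sample design is actually used.

```python
import random

def draw_finite_population(N, mu, sigma, rng):
    """Step 1: realize a finite population of N labeled units.

    Assumed (hypothetical) superpopulation model: y ~ Normal(mu, sigma).
    x is auxiliary information known for every unit in the population.
    """
    return [
        {"label": i, "x": rng.uniform(0.0, 1.0), "y": rng.gauss(mu, sigma)}
        for i in range(N)
    ]

def draw_sample(population, n, rng):
    """Step 2: select n labels by simple random sampling without replacement."""
    labels = rng.sample(range(len(population)), n)
    return [population[i] for i in labels]

rng = random.Random(0)
population = draw_finite_population(N=1000, mu=50.0, sigma=10.0, rng=rng)
sample = draw_sample(population, n=100, rng=rng)

# y is observed only for sampled units; x and the labels are known for all units.
finite_population_mean = sum(u["y"] for u in population) / len(population)
sample_mean = sum(u["y"] for u in sample) / len(sample)
print(finite_population_mean, sample_mean)
```

In this sketch there are three distinct quantities: the superpopulation parameter `mu`, the finite population mean, and the sample mean. Finite population inference targets the second; superpopulation inference targets the first.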

The superpopulation approach allows the use of additional model assumptions by specifying either a frequency distribution for the finite population characteristics or a prior distribution directly for them. Including this extra information as part of the inference often increases precision. A potential danger is that inference may be either biased, due to model misspecification, or inappropriate if the prior information used is not shared by others.
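As one hedged illustration of how a model assumption can add precision, the sketch below compares a plain expansion estimator of a population total with a ratio estimator. The ratio estimator is a stand-in technique chosen for illustration, not something named in this entry; it implicitly relies on an assumed superpopulation model in which `y` is roughly proportional to an auxiliary variable `x` whose population total is known. The data-generating values and function names are hypothetical.

```python
import random

def expansion_estimate(sample_y, N):
    """Estimate the population total of y using only the sample mean and N."""
    return N * sum(sample_y) / len(sample_y)

def ratio_estimate(sample_x, sample_y, total_x):
    """Estimate the population total of y assuming y is roughly proportional to x.

    total_x is the known population total of the auxiliary variable x.
    """
    return total_x * sum(sample_y) / sum(sample_x)

rng = random.Random(1)
N, n = 2000, 50

# Hypothetical finite population generated from the assumed model y = 2x + noise.
x = [rng.uniform(10.0, 100.0) for _ in range(N)]
y = [2.0 * xi + rng.gauss(0.0, 5.0) for xi in x]

idx = rng.sample(range(N), n)  # simple random sample of labels
sample_x = [x[i] for i in idx]
sample_y = [y[i] for i in idx]

print("true total:        ", sum(y))
print("expansion estimate:", expansion_estimate(sample_y, N))
print("ratio estimate:    ", ratio_estimate(sample_x, sample_y, sum(x)))
```

When the proportionality assumption holds approximately, the ratio estimate is typically closer to the true total than the expansion estimate. When it fails, for example if `y` has a large intercept unrelated to `x`, the ratio estimate can be biased, which is the misspecification danger described above.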

A different but related concept is that of a nested sequence of populations that increases to an arbitrarily large total. This has been used to demonstrate asymptotic properties of finite population estimates.
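One common way to write such a setup is shown below. The index \nu and the symbols U_\nu, N_\nu, n_\nu, \bar{Y}_\nu, and \hat{\bar{Y}}_\nu are notation assumed here for illustration, not taken from this entry, and consistency is given only as an example of the kind of asymptotic property that might be studied.

```latex
% Nested finite populations whose sizes grow without bound
U_1 \subset U_2 \subset \cdots \subset U_\nu \subset \cdots,
\qquad N_\nu = |U_\nu| \to \infty \quad \text{as } \nu \to \infty .

% A sample of size n_\nu is drawn from U_\nu, with n_\nu \to \infty.
% Example property: the estimator of the finite population mean is consistent,
\hat{\bar{Y}}_\nu - \bar{Y}_\nu \xrightarrow{\;p\;} 0
\quad \text{as } \nu \to \infty ,
\qquad \bar{Y}_\nu = \frac{1}{N_\nu} \sum_{i \in U_\nu} y_i .
```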

Donald J. Malec

Further Readings

Cassel, C.-M., Särndal, C.-E., & Wretman, J. H. (1977). Foundations of inference in survey sampling. New York: Wiley.
Madow, W. G. (1948). On the limiting distributions of estimates based on samples from finite universes. Annals of Mathematical Statistics, 19(4), 535–545. http://dx.doi.org/10.1214/aoms/1177730149
Skinner, C. J., Holt, D., & Smith, T. M. F. (Eds.). (1989). Analysis of complex surveys. New York: Wiley.