Post-Stratification

Stratification is a well-known sampling tool built on the premise that like units in a population should be treated similarly. It is a statistical fact that grouping similar units, when sampling, can generally reduce the variance of the survey estimates obtained. Stratification can be done when selecting units for study, or it can be carried out afterward. The latter application is usually termed post-stratification.

Illustration

To illustrate the differences between stratification and post-stratification, assume a researcher is interested in the total poundage of a population that consists of 10,000 baby mice and one adult elephant. Suppose, further, that the average baby mouse weighs 0.2 pounds but the elephant weighs three and one-half tons, or 7,000 pounds. This would mean, if the whole population were to be enumerated, that the researcher would obtain a total of

$$10{,}000 \times 0.2 + 1 \times 7{,}000 = 2{,}000 + 7{,}000 = 9{,}000 \text{ pounds}.$$

Now, if the researcher drew a sample of size n = 2 from this population, she or he would not get a very good estimate of the poundage, unless she or he took the elephant as one of the selections. So, naturally the researcher would stratify, taking the one elephant plus one of the mice at random.

If the researcher took account of population sizes, then she or he would multiply the poundage of the mouse selected by N1 = 10,000 and add it to the poundage of the N2 = 1 elephant, and the expected value of the estimated total poundage over repeated samples would be 9,000 pounds (as shown in the preceding formula). Of course the individual mice vary in size (with a standard deviation of 0.005 pounds, say), so the researcher would not expect to hit the total “dead on” each time, but she or he might come very close, even with this small a sample (i.e., the estimate would average 9,000 pounds with a standard error of 10,000 × 0.005 = 50 pounds).
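As a minimal sketch of the stratified calculation just described, the Python snippet below computes the expected estimate and its standard error, then checks them with a small simulation. The function and variable names are illustrative rather than taken from the source, and drawing mouse weights from a normal distribution is an assumption made only for this illustration. The finite population correction is ignored because a single mouse is drawn out of 10,000.

```python
import random

N_MICE = 10_000            # N1: number of mice in the population
N_ELEPHANTS = 1            # N2: number of elephants
MOUSE_MEAN = 0.2           # average mouse weight in pounds (from the text)
MOUSE_SD = 0.005           # spread among individual mice in pounds (from the text)
ELEPHANT_WEIGHT = 7_000.0  # elephant weight in pounds (from the text)


def stratified_total(mouse_weight, elephant_weight=ELEPHANT_WEIGHT):
    """Weight each sampled unit by the size of its stratum and add."""
    return N_MICE * mouse_weight + N_ELEPHANTS * elephant_weight


# Expected estimate and standard error implied by the figures above.
expected_total = stratified_total(MOUSE_MEAN)
standard_error = N_MICE * MOUSE_SD

# Hypothetical simulation: one mouse per sample, normally distributed weights
# (the normal shape is an assumption, not something the text states).
rng = random.Random(0)
estimates = [
    stratified_total(rng.gauss(MOUSE_MEAN, MOUSE_SD)) for _ in range(20_000)
]
simulated_mean = sum(estimates) / len(estimates)

print(expected_total, standard_error, round(simulated_mean))
```

Because the elephant is taken with certainty, all of the sampling variability comes from the one mouse, which is why the standard error is simply the mouse stratum size times the spread of mouse weights.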

How does this example change if the researcher post-stratified? Put another way, what if the researcher decided to stratify after, not before, she or he had selected the sample of n = 2? Suppose, to be specific, that the researcher had taken a simple random sample without separate strata for elephants and mice.

Well, first of all, of the (10,001) × (10,000)/2 or approximately 50 million samples of two elements, only 10,000 will have exactly one elephant and one mouse. All the other samples will have two mice, and there would be no way to get a good estimate from these samples no matter what the researcher did—a big price to pay for not stratifying before selection. To be specific, if two mice are selected, the expected estimate is

$$\frac{10{,}001}{2} \times (0.2 + 0.2) = 2{,}000.2 \text{ pounds}.$$
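The sample counts quoted above can be checked with a short calculation. This is a sketch using Python's standard `math.comb`; the variable names are illustrative.

```python
from math import comb

N_MICE = 10_000
N_ELEPHANTS = 1
N = N_MICE + N_ELEPHANTS              # population size: 10,001

total_samples = comb(N, 2)            # all unordered samples of size 2
mixed_samples = N_MICE * N_ELEPHANTS  # one mouse paired with the elephant
two_mice_samples = comb(N_MICE, 2)    # samples that miss the elephant

# Share of simple random samples that happen to contain the elephant.
share_with_elephant = mixed_samples / total_samples

print(total_samples, mixed_samples, two_mice_samples)
print(f"{share_with_elephant:.6f}")
```

Only about two samples in every 10,000 contain the elephant, so nearly every simple random sample of size 2 consists of two mice.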

The remaining 10,000 samples, with one elephant and one mouse, do even more poorly unaided. For them the researcher will have an unadjusted expected estimate of

$$\frac{10{,}001}{2} \times (0.2 + 7{,}000) = 35{,}004{,}500.1 \text{ pounds}.$$
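The unadjusted estimates for the two kinds of sample can be sketched in Python, assuming the unadjusted estimator is the usual simple random sampling expansion estimator (population size times the sample mean). That reading is an assumption about the source, and the function name is illustrative.

```python
N = 10_001                 # 10,000 mice plus one elephant
MOUSE_WEIGHT = 0.2         # average mouse weight in pounds
ELEPHANT_WEIGHT = 7_000.0  # elephant weight in pounds
TRUE_TOTAL = 10_000 * MOUSE_WEIGHT + ELEPHANT_WEIGHT


def expansion_estimate(sample_weights, population_size=N):
    """Estimate the population total as population size times the sample mean."""
    return population_size * sum(sample_weights) / len(sample_weights)


two_mice = expansion_estimate([MOUSE_WEIGHT, MOUSE_WEIGHT])
mouse_and_elephant = expansion_estimate([MOUSE_WEIGHT, ELEPHANT_WEIGHT])

print(f"two mice:           {two_mice:,.1f} pounds")
print(f"mouse and elephant: {mouse_and_elephant:,.1f} pounds")
print(f"true total:         {TRUE_TOTAL:,.1f} pounds")
```

Under this assumption the two-mice samples fall far short of the true total, while the samples containing the elephant overshoot it by several orders of magnitude, because the elephant's weight is expanded as if it represented half the population.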

This second result, however, can be “saved” by post-stratification in the same way as was previously illustrated for stratification, since each sample has one mouse and one elephant. The calculations here are

$$10{,}000 \times 0.2 + 1 \times 7{,}000 = 9{,}000 \text{ pounds}.$$

In this case, the researcher gets back what the stratified estimator provided, but there is clearly a big risk that she or he might get a sample that would not be usable (i.e., one that post-stratification cannot save).
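A minimal sketch of the post-stratified calculation follows, assuming the stratum sizes (10,000 mice and one elephant) are known after selection. The function name and the choice to return `None` for an unusable sample are illustrative conventions, not part of the source.

```python
STRATUM_SIZES = {"mouse": 10_000, "elephant": 1}


def poststratified_total(sample, stratum_sizes=STRATUM_SIZES):
    """Return the post-stratified total, or None if a stratum has no sampled units.

    `sample` is a list of (stratum, weight_in_pounds) pairs.
    """
    by_stratum = {}
    for stratum, weight in sample:
        by_stratum.setdefault(stratum, []).append(weight)

    if any(stratum not in by_stratum for stratum in stratum_sizes):
        return None  # an empty post-stratum cannot be estimated from this sample

    # Stratum size times the sample mean within each stratum, summed over strata.
    return sum(
        size * sum(by_stratum[stratum]) / len(by_stratum[stratum])
        for stratum, size in stratum_sizes.items()
    )


usable = poststratified_total([("mouse", 0.2), ("elephant", 7_000.0)])
unusable = poststratified_total([("mouse", 0.2), ("mouse", 0.2)])

print(usable, unusable)
```

The sample with one mouse and one elephant reproduces the stratified estimate, whereas the two-mice sample yields no estimate at all because it carries no information about the elephant stratum.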

...
