
In survey research, there are times when information is available on every unit in the population. If a variable that is known for every unit of the population is not a variable of interest but is instead employed to improve the sampling plan or to enhance estimation of the variables of interest, it is called an auxiliary variable.

Ratio and Regression Estimation

The term auxiliary variables is most commonly associated with the use of such variables, available for all units in the population, in ratio estimation, regression estimation, and extensions (calibration estimation).

The ratio estimator is a widely used estimator that takes advantage of an auxiliary variable to improve estimation. If x is the auxiliary variable and y is the variable of interest, let X and Y denote the population totals for x and y, and let X̂ and Ŷ denote unbiased estimators of X and Y. Then the ratio estimator ŶR of Y is given by

$$\hat{Y}_R = \frac{\hat{Y}}{\hat{X}}\, X$$

ŶR improves upon Ŷ provided that the correlation between x and y exceeds one-half of Sx/X̄ divided by Sy/Ȳ, where Sx and Sy are the population standard deviations of x and y, and X̄ and Ȳ are the population means of x and y. In other words, the correlation must exceed half the ratio of the coefficient of variation of x to that of y. The ratio estimator takes advantage of the correlation between x and y to estimate Y/X well by Ŷ/X̂, and further takes advantage of X being known.
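As a minimal sketch of the ratio estimator, the Python below assumes simple random sampling without replacement, so that the population size times the sample mean serves as the unbiased estimator of each total. The function names, the sample values, the population size of 100, and the known total of 2600 are hypothetical and chosen only for illustration.

```python
def ratio_estimate(sample_x, sample_y, population_size, known_total_x):
    """Ratio estimate of the total of y, assuming simple random sampling."""
    n = len(sample_x)
    # Expansion estimators of the two totals (unbiased under SRS).
    x_hat = population_size * sum(sample_x) / n
    y_hat = population_size * sum(sample_y) / n
    # Scale the estimated ratio Y/X by the known population total of x.
    return (y_hat / x_hat) * known_total_x


def ratio_beats_expansion(correlation, cv_x, cv_y):
    """Check the condition: correlation > (1/2) * CV(x) / CV(y)."""
    return correlation > 0.5 * cv_x / cv_y


# Hypothetical sample of four units and hypothetical population values.
sample_x = [10, 20, 30, 40]
sample_y = [21, 39, 62, 78]
estimate = ratio_estimate(sample_x, sample_y, population_size=100, known_total_x=2600)
```

With these made-up numbers the expansion estimate of Y is 5000 and the estimated total of x is 2500, so the ratio estimator scales 5000 by 2600/2500 to give 5200.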

A more flexible estimator than the ratio estimator also taking advantage of the auxiliary variable x is the regression estimator:

$$\hat{Y}_{reg} = \hat{Y} + \hat{b}\,(X - \hat{X})$$

where b̂ is the estimated slope of y on x from the sample data. The regression estimator can be extended to make use of a vector, X, of auxiliary variables rather than a single one.
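The regression estimator can be sketched in the same way. The code below assumes simple random sampling and an ordinary least-squares slope; it adjusts the expansion estimate of the total of y by the slope times the gap between the known total of x and its estimate. All names and numbers are hypothetical.

```python
def regression_estimate(sample_x, sample_y, population_size, known_total_x):
    """Regression estimate of the total of y, assuming simple random sampling."""
    n = len(sample_x)
    mean_x = sum(sample_x) / n
    mean_y = sum(sample_y) / n
    # Ordinary least-squares slope of y on x from the sample.
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(sample_x, sample_y))
    s_xx = sum((x - mean_x) ** 2 for x in sample_x)
    slope = s_xy / s_xx
    # Expansion estimators of the two totals.
    x_hat = population_size * mean_x
    y_hat = population_size * mean_y
    # Correct y_hat for the amount by which x_hat misses the known total.
    return y_hat + slope * (known_total_x - x_hat)


# Hypothetical sample in which y = 2x + 1 exactly.
reg_estimate = regression_estimate([1, 2, 3, 4], [3, 5, 7, 9],
                                   population_size=10, known_total_x=30)
```

Here the sample underestimates the total of x (25 against a known 30), so the expansion estimate of 60 is adjusted upward by the slope of 2 times the shortfall of 5, giving 70.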

In the case of stratified sampling, the ratio and regression estimators have a number of variants. In the case of ratio estimation, the separate ratio estimator does ratio estimation at the stratum level and then sums across strata, whereas the combined ratio estimator computes X̂ and Ŷ across strata and then takes their ratio.
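The difference between the separate and combined ratio estimators can be illustrated with a short sketch. It assumes simple random sampling within each stratum and that the total of x is known for every stratum; the dictionary layout and the two strata shown are hypothetical.

```python
def stratum_totals(stratum):
    """Expansion estimates of the x and y totals within one stratum."""
    n = len(stratum["x"])
    x_hat = stratum["N"] * sum(stratum["x"]) / n
    y_hat = stratum["N"] * sum(stratum["y"]) / n
    return x_hat, y_hat


def separate_ratio_estimate(strata):
    """Apply the ratio estimator within each stratum, then sum."""
    total = 0.0
    for stratum in strata:
        x_hat, y_hat = stratum_totals(stratum)
        total += (y_hat / x_hat) * stratum["X"]
    return total


def combined_ratio_estimate(strata):
    """Sum the estimated totals across strata, then take one ratio."""
    x_hat = sum(stratum_totals(s)[0] for s in strata)
    y_hat = sum(stratum_totals(s)[1] for s in strata)
    known_total_x = sum(s["X"] for s in strata)
    return (y_hat / x_hat) * known_total_x


# Hypothetical strata: N is the stratum size, X the known stratum total of x.
strata = [
    {"N": 10, "X": 50, "x": [4, 6], "y": [8, 12]},
    {"N": 20, "X": 220, "x": [10, 10], "y": [30, 30]},
]
separate = separate_ratio_estimate(strata)
combined = combined_ratio_estimate(strata)
```

In this made-up example the stratum ratios differ (2 and 3), so the two variants disagree: the separate estimator gives 760 and the combined estimator gives 756.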

Unequal Probability Sampling

In unequal probability sampling, the auxiliary variable x is termed a measure of size. The probability of selecting a unit is proportional to its measure of size. For example, in a survey of business establishments, the measure of size might be the number of employees or the total revenue of the establishment, depending on the purpose of the survey and the auxiliary information available. There are numerous sampling schemes for achieving selection probabilities proportional to the measure of size, one being unequal probability systematic sampling. Under general conditions, these schemes are more efficient than equal probability sampling when there is substantial variability in the size of the units in the population.
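One of the schemes mentioned, unequal probability systematic sampling, can be sketched as follows. This is an illustrative version only: it lays the units end to end on a line whose segments have lengths equal to the measures of size, picks a random start, and then steps along at a fixed interval. It assumes no unit's size exceeds the interval, since otherwise that unit could be selected more than once. The function names and the five sizes are hypothetical.

```python
import random


def pps_systematic_sample(sizes, n, rng):
    """Select n unit indices with probability proportional to size."""
    total = sum(sizes)
    interval = total / n
    start = rng.random() * interval  # random start in [0, interval)
    selected = []
    cumulative = 0.0
    index = 0
    for j in range(n):
        point = start + j * interval
        # Advance to the unit whose size segment contains the point.
        while index < len(sizes) - 1 and cumulative + sizes[index] <= point:
            cumulative += sizes[index]
            index += 1
        selected.append(index)
    return selected


def inclusion_probabilities(sizes, n):
    """Inclusion probability n * x_i / X for each unit."""
    total = sum(sizes)
    return [n * size / total for size in sizes]


# Hypothetical measures of size, e.g. employees per establishment.
sizes = [5, 10, 15, 20, 50]
sample = pps_systematic_sample(sizes, n=2, rng=random.Random(0))
probabilities = inclusion_probabilities(sizes, n=2)
```

With these sizes the interval is 50, so the largest unit (size 50) has inclusion probability 1 and is always selected, while the smallest has inclusion probability 0.1.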

Stratification

It is often advantageous to divide a population into homogeneous groups called strata and to select a sample independently from each stratum. Auxiliary information on all population units is needed in order to form the strata. The auxiliary information can be a categorical variable (e.g. the county of the unit), in which case the categories or groups of categories form the strata. The auxiliary information could also be continuous, in which case cut points define the strata. For example, the income of a household or revenue of an establishment could be used to define strata by specifying the upper and lower limits of income or revenue for each stratum.
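Forming strata from a continuous auxiliary variable by cut points might look like the sketch below. It assumes each cut point is the inclusive lower limit of the next stratum and draws a simple random sample independently within each stratum; the household labels, incomes, and cut points are hypothetical.

```python
import random
from bisect import bisect_right


def assign_stratum(value, cut_points):
    """Return the stratum index for a value, given sorted cut points."""
    return bisect_right(cut_points, value)


def build_strata(frame, cut_points):
    """Group every unit in the frame into its stratum."""
    strata = {h: [] for h in range(len(cut_points) + 1)}
    for unit, value in frame.items():
        strata[assign_stratum(value, cut_points)].append(unit)
    return strata


def stratified_sample(strata, n_per_stratum, rng):
    """Draw a simple random sample independently from each stratum."""
    return {
        h: rng.sample(units, min(n_per_stratum, len(units)))
        for h, units in strata.items()
    }


# Hypothetical frame: household label -> income (the auxiliary variable).
frame = {"A": 12000, "B": 30000, "C": 90000, "D": 25000, "E": 74000, "F": 150000}
cut_points = [25000, 75000]  # strata: below 25000, 25000 to below 75000, 75000 and up
strata = build_strata(frame, cut_points)
sample_by_stratum = stratified_sample(strata, n_per_stratum=2, rng=random.Random(1))
```

Because the income is known for every unit in the frame, each unit can be placed in exactly one stratum before any sampling takes place.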

...
