Design Effects (deff)

The design effect (deff) is a survey statistic computed as the quotient of the variability in the parameter estimate of interest resulting from the sampling design and the variability in the estimate that would be obtained from a simple random sample of the same size.

In large-scale sample surveys, inferences are usually based on the standard randomization principle of survey sampling. Under such an approach, the responses are treated as fixed, and the randomness is assumed to come solely from the probability mechanism that generates the sample. For example, in simple random sampling without replacement, the sample mean is unbiased with randomization-based variance given by

$$V_{SRS}(\bar{y}) = (1 - f)\,\frac{S^2}{n}$$

where $n$, $N$, and $f = n/N$ denote the sample size, the population size, and the sampling fraction, respectively, and $S^2$ is the finite population variance with the divisor $N - 1$. Usually $f$ is negligible and can be dropped from the formula. Because $0 < f \le 1$, dropping it can only increase the result, so the simplified formula $S^2/n$ is in any case conservative for the variance.
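As a rough illustration of this formula, the following sketch computes $(1 - f) S^2 / n$ for a small population, compares it with the conservative version that drops $f$, and checks it against a brute-force enumeration of every possible simple random sample. The population values and the function name `srs_variance_of_mean` are hypothetical and chosen only for this example.

```python
from itertools import combinations
from statistics import mean, variance


def srs_variance_of_mean(population, n, use_fpc=True):
    """Randomization variance of the sample mean under simple random
    sampling without replacement: (1 - f) * S^2 / n."""
    N = len(population)
    s2 = variance(population)  # statistics.variance uses the divisor N - 1
    f = n / N                  # sampling fraction
    correction = (1 - f) if use_fpc else 1.0
    return correction * s2 / n


# Hypothetical population, for illustration only.
population = [2, 4, 6, 8, 10, 12]
n = 2

exact = srs_variance_of_mean(population, n)
conservative = srs_variance_of_mean(population, n, use_fpc=False)

# Brute-force check: average squared deviation of the sample mean
# over all equally likely samples of size n.
pop_mean = mean(population)
sample_means = [mean(s) for s in combinations(population, n)]
brute_force = sum((m - pop_mean) ** 2 for m in sample_means) / len(sample_means)

print(f"with f:      {exact:.4f}")
print(f"brute force: {brute_force:.4f}")
print(f"without f:   {conservative:.4f}")
```

The first two printed values agree, while the version without $f$ is larger, which is the sense in which the simplified formula is conservative.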

In most cases, however, complex sampling designs (indicated by the subscript CSD in the following) are applied rather than simple random sampling. In such a situation, the sample mean $\bar{y}$ can still be an unbiased estimator under the usual randomization approach if the sampling design is one in which each sampling unit in the finite population has the same chance $f$ of being selected. However, $V_{SRS}(\bar{y})$ usually underestimates the true randomization variance of $\bar{y}$ under the complex sampling design, say $V_{CSD}(\bar{y})$. To account for this underestimation, Leslie Kish proposed the following variance inflation factor, commonly known as the design effect:

$$DEFF_R = \frac{V_{CSD}(\bar{y})}{V_{SRS}(\bar{y})} \tag{1}$$

where the subscript R denotes the perspective of the randomization framework. Although the design effect is considered for the usual sample mean in the vast majority of empirical applications, the ratio in Equation 1 can be defined more generally for the variance of any estimator $\hat{\theta}$ under any complex design. In practice, $DEFF_R$ is unknown, and approximations and estimates are employed to assess its magnitude.

To give an example, consider a population of $N = 9$ elements from which one wishes to select $n = 3$ into the sample. Let the values $y_i$, $i = 1, \ldots, 9$, be given by 10, 18, 32, 11, 21, 33, 12, 21, 31. If one samples the elements using systematic sampling (every third element), as an instance of a complex sample design, exactly three samples are possible: $s_1 = [10, 11, 12]$, $s_2 = [18, 21, 21]$, $s_3 = [32, 33, 31]$. Their sample means are 11, 20, and 32, whereas the population mean is 21. Given these extreme data, it can already be seen, without doing any calculations, that the variance of the sample mean is inflated compared to a simple random sample of three elements. If one calculates the variance of the sample mean given the systematic sample design (CSD = SYS), one gets

$$V_{SYS}(\bar{y}) = \frac{1}{3}\left[(11 - 21)^2 + (20 - 21)^2 + (32 - 21)^2\right] = 74$$

And, for the variance of the sample mean under simple random sampling,

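$$V_{SRS}(\bar{y}) = (1 - f)\,\frac{S^2}{n} = \left(1 - \frac{3}{9}\right)\frac{84.5}{3} \approx 18.78$$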

Thus the design effect of this example is

$$DEFF_R = \frac{V_{SYS}(\bar{y})}{V_{SRS}(\bar{y})} = \frac{74}{18.78} \approx 3.94$$

which means that the variance of the sample mean, when choosing the sample by systematic sampling, is nearly 4 times as large as the variance of the same estimator under simple random sampling. This indicates a considerable loss of precision (i.e., a larger variance for the same sample size).
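The worked example can be checked numerically. The sketch below enumerates the three possible systematic samples and all 84 possible simple random samples of size 3, and computes the randomization variance of the sample mean under each design directly from its definition. The data are the nine values given in the text; the helper name `variance_of_mean` and the variable names are assumptions made for this illustration.

```python
from itertools import combinations
from statistics import mean

# Population values from the worked example.
y = [10, 18, 32, 11, 21, 33, 12, 21, 31]
N = len(y)
n = 3
k = N // n  # sampling interval for systematic sampling


def variance_of_mean(samples, pop_mean):
    """Randomization variance of the sample mean when every sample in
    `samples` is equally likely to be drawn."""
    means = [mean(s) for s in samples]
    return sum((m - pop_mean) ** 2 for m in means) / len(means)


pop_mean = mean(y)

# Systematic sampling: one sample per random start 0, ..., k - 1.
systematic_samples = [y[start::k] for start in range(k)]

# Simple random sampling without replacement: all subsets of size n.
srs_samples = list(combinations(y, n))

v_sys = variance_of_mean(systematic_samples, pop_mean)
v_srs = variance_of_mean(srs_samples, pop_mean)
deff = v_sys / v_srs

print(systematic_samples)           # [[10, 11, 12], [18, 21, 21], [32, 33, 31]]
print(f"V_SYS = {v_sys:.2f}")       # V_SYS = 74.00
print(f"V_SRS = {v_srs:.2f}")       # V_SRS = 18.78
print(f"DEFF  = {deff:.2f}")        # DEFF  = 3.94
```

Enumerating all samples is feasible only for toy populations like this one; it is used here because it shows the two variances, and hence their ratio, without relying on any closed-form formula.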

...
