Survey data have been effectively used to provide suitable statistics for the target population and for many subpopulations, often called domains or areas. Domains may be geographical regions (e.g. states or counties), sociodemographic groups (e.g. nonwhite Hispanic women between 18 and 65 years) or other subpopulations. A domain or an area is considered “large” or “major” if the domain sample is sufficiently large so that it can provide a direct estimate of the domain parameter, for example, the mean, with adequate precision. A domain or an area is regarded as “small” if the domain-specific sample is not large enough to produce an estimate with reliable precision. Areas or domains with small samples are called small areas, small domains, local areas, subdomains, or substates.

In 1996, the U.S. Congress began to require that the Secretary of Commerce publish, at least biennially, current data related to the incidence of poverty for states, counties, and local jurisdictions of government and school districts “to the extent feasible.” State and county estimates of the number of 5- to 17-year-old children in poverty and of the number of people aged 65 and older in poverty are required. Poverty estimates for children are used to allocate federal and state funds; the federal funds alone have amounted to nearly $100 billion annually in recent years. Small area estimation is therefore very important for the well-being of many citizens.

A Brief Primer on Important Terms in Small Area Estimation

For $m$ small areas, suppose $Y_{ij}$, $j = 1, \ldots, N_i$, denote values of a response variable ($Y$) for the $N_i$ units in the $i$th small area. Imagine one would like to estimate $\bar{Y}_i = N_i^{-1} \sum_{j=1}^{N_i} Y_{ij}$, the finite population mean. Suppose $X$ is a vector of explanatory variables. If explanatory variables are available for all the sampled units in the $i$th small area, to be denoted for simplicity by $1, \ldots, n_i$, then a unit-level model is used. But if only direct estimates $\bar{y}_i$ for $\bar{Y}_i$ and summary data $\bar{x}_i$ for explanatory variables are available at the small area level, then an area-level model is used. If indirect small area estimates are produced by fitting a model relating the response variable and explanatory variables, and prediction of a small area mean is obtained by substituting explanatory variables into the estimated model, one gets a synthetic estimate, denoted by $\hat{y}_{is}$. Synthetic estimates are heavily model dependent, susceptible to model failure, and not design-consistent. A composite estimate, which is a convex combination of $\bar{y}_i$ and $\hat{y}_{is}$, rectifies these deficiencies. This entry considers only some of the basic aspects of small area estimation. For example, neither the time series and cross-sectional approach to small area estimation nor the interval estimation problem is considered here. For these and many other important topics, the advanced reader should consult J. N. K. Rao's Small Area Estimation.
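
To make the idea of a composite estimate concrete, the following is a minimal Python sketch of a convex combination of a direct estimate and a synthetic estimate for each small area. The function name `composite_estimate`, the argument names, and all numbers are hypothetical and chosen only for illustration; in particular, the weights are supplied by hand here, whereas in practice they would be derived from a fitted model.

```python
import numpy as np

def composite_estimate(direct, synthetic, weight):
    """Return weight * direct + (1 - weight) * synthetic for each area.

    A weight near 1 trusts the direct (design-based) estimate;
    a weight near 0 trusts the synthetic (model-based) estimate.
    """
    direct = np.asarray(direct, dtype=float)
    synthetic = np.asarray(synthetic, dtype=float)
    weight = np.asarray(weight, dtype=float)
    # A convex combination requires every weight to lie in [0, 1].
    if np.any((weight < 0) | (weight > 1)):
        raise ValueError("weights must lie in [0, 1]")
    return weight * direct + (1.0 - weight) * synthetic

# Hypothetical values for three small areas (not real survey data).
direct = [10.0, 20.0, 30.0]      # direct estimates of the area means
synthetic = [12.0, 18.0, 26.0]   # synthetic estimates from a fitted model
weight = [0.0, 0.5, 1.0]         # illustrative weights, set by hand

estimates = composite_estimate(direct, synthetic, weight)
print(estimates)
```

With a weight of 0 the composite estimate equals the synthetic estimate, with a weight of 1 it equals the direct estimate, and intermediate weights give a compromise between the two.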

Two Popular Small Area Models

Both linear and nonlinear models and both Bayesian and frequentist approaches are popular in small area estimation. While the estimated best linear unbiased prediction (EBLUP) approach is key to developing composite estimates based on mixed linear models, the empirical Bayes (EB) and the hierarchical Bayes (HB) approaches can be used for both linear and nonlinear models. Many model-based developments in small area estimation use normality assumptions. For normal linear models, the EBLUP and EB predictors of the small area means are identical.
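
As an illustration of why the two predictors can coincide, consider a simple area-level normal linear model. This is a sketch under stated assumptions rather than a model taken from this entry: the symbols $\theta_i$, $\beta$, $A$, $D_i$, and $\gamma_i$ are introduced here for illustration, and the sampling variances $D_i$ are assumed known.

```latex
\begin{align*}
\bar{y}_i &= \theta_i + e_i, & e_i &\sim N(0, D_i), \\
\theta_i &= \bar{x}_i^{\top} \beta + v_i, & v_i &\sim N(0, A), \\
E(\theta_i \mid \bar{y}_i) &= \gamma_i \bar{y}_i + (1 - \gamma_i)\, \bar{x}_i^{\top} \beta,
& \gamma_i &= \frac{A}{A + D_i}.
\end{align*}
```

Under these assumptions the best predictor is itself a composite estimate: a convex combination of the direct estimate and a regression-synthetic estimate, with more weight on the direct estimate when its sampling variance $D_i$ is small relative to $A$. Replacing $\beta$ and $A$ with the same estimates in both approaches yields the same predictor, which is consistent with the statement that the EBLUP and EB predictors agree for normal linear models.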
