
Post-survey adjustments are a series of statistical adjustments applied to survey data prior to data analysis and dissemination. Although no universal definition exists, post-survey adjustments typically include data editing, missing data imputation, weighting adjustments, and disclosure limitation procedures.

Data Editing

Data editing may be defined as procedures for detecting and correcting errors in so-called raw survey data. Data editing may occur during data collection if the interviewer identifies obvious errors in the survey responses. As a component of post-survey adjustments, data editing involves more elaborate, systematic, and automated statistical checks performed by computers. Survey organizations and government statistical agencies may maintain special statistical software to implement data editing procedures.

Data editing begins by specifying a set of editing rules for a given editing task. An editing program is then designed and applied to the survey data to identify and correct various errors. First, missing data and “not applicable” responses are properly coded, based on the structure of the survey instrument. Second, range checks are performed on all relevant variables to verify that no invalid (out-of-range) responses are present. Out-of-range data are subject to further review and possible correction. Third, consistency checks are performed to ensure that the responses to two or more data items do not contradict one another. In computerized data collection, such as Computer-Assisted Telephone Interviewing (CATI), Computer-Assisted Personal Interviewing (CAPI), or Web surveys, real-time data editing is typically built into the data collection system so that the validity of the data is evaluated as the data are collected.
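The three kinds of checks described above can be sketched as a small editing program. This is only an illustrative sketch: the variable names (age, employed, hours_worked), the "not applicable" code, the valid age range, and the skip rule are all assumptions made for the example, not rules taken from any particular survey or editing system.

```python
# Hypothetical codes and limits, chosen only for illustration.
MISSING = None          # item left unanswered
NOT_APPLICABLE = -1     # item legitimately skipped by the instrument
AGE_RANGE = (0, 120)    # assumed valid range for age in years


def edit_record(record: dict) -> list:
    """Apply the edit rules to one record and return a list of edit failures."""
    failures = []

    # Step 1: code "not applicable" items from the instrument's structure.
    # Assumed skip rule: hours worked is asked only of employed respondents.
    if record.get("employed") == "no" and record.get("hours_worked") is MISSING:
        record["hours_worked"] = NOT_APPLICABLE

    # Step 2: range check on a single variable.
    age = record.get("age")
    if age is not MISSING and not (AGE_RANGE[0] <= age <= AGE_RANGE[1]):
        failures.append("age out of range")

    # Step 3: consistency check across two items.
    hours = record.get("hours_worked")
    if (record.get("employed") == "no"
            and hours not in (MISSING, NOT_APPLICABLE)
            and hours > 0):
        failures.append("not employed but reports hours worked")

    return failures


clean = {"age": 34, "employed": "no", "hours_worked": None}
suspect = {"age": 150, "employed": "no", "hours_worked": 40}
clean_failures = edit_record(clean)
suspect_failures = edit_record(suspect)
```

In this sketch, failing records are only flagged; deciding how to correct them is left to further review, in line with the description above.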

Missing Data Imputation and Weighting Adjustments

Imputation and weighting adjustments are standard tools for dealing with missing data in surveys. Missing data result from two types of nonresponse: unit nonresponse and item nonresponse. Unit nonresponse occurs when no data are collected from a sampled unit, while item nonresponse occurs when no data are obtained for some items from a responding unit. In general, imputation is employed for item nonresponse, while weighting is reserved for unit nonresponse.

Imputation is the substitution of missing data with estimated values. Imputation produces a complete, rectangular data matrix that can support analyses where missing values might otherwise constrain what can be done. The statistical goal of imputation is to reduce the potential bias of survey estimates due to item nonresponse, which can only be achieved to the extent that the missing data mechanism is correctly identified and modeled.

Many imputation techniques are used in practice. Logical imputation or deductive imputation is used when a missing response can be logically inferred or deduced with certainty from other responses provided by the respondent. Logical imputation is preferred over other imputation methods because of its deterministic nature. Hot-deck imputation fills in missing data using responses from other respondents (donor records) who are considered similar, with respect to some characteristics, to the respondent with the missing data. In cold-deck imputation, the donor may be from a data source external to the current survey. In regression imputation, a regression model is fitted for the variable with missing data and then used to predict the missing responses.
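As a rough illustration of two of these techniques, the sketch below implements a random hot-deck within imputation classes and a single-predictor regression imputation. The field names (region, age, income), the choice of region as the imputation class, and the use of a simple least-squares line are assumptions made for the example; production systems typically use richer donor-matching rules and models.

```python
import random
from statistics import mean


def hot_deck_impute(records, target, class_var, rng):
    """Fill missing `target` values with a randomly chosen donor value
    from respondents in the same imputation class (`class_var`)."""
    donors = {}
    for r in records:
        if r[target] is not None:
            donors.setdefault(r[class_var], []).append(r[target])
    imputed = []
    for r in records:
        r = dict(r)  # copy so the input records are left unchanged
        if r[target] is None and donors.get(r[class_var]):
            r[target] = rng.choice(donors[r[class_var]])
        imputed.append(r)
    return imputed


def regression_impute(records, target, predictor):
    """Fill missing `target` values with predictions from a least-squares
    line fitted to the complete cases."""
    complete = [(r[predictor], r[target]) for r in records if r[target] is not None]
    x_bar = mean(x for x, _ in complete)
    y_bar = mean(y for _, y in complete)
    sxx = sum((x - x_bar) ** 2 for x, _ in complete)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in complete)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    imputed = []
    for r in records:
        r = dict(r)
        if r[target] is None:
            r[target] = intercept + slope * r[predictor]
        imputed.append(r)
    return imputed


# Made-up records for illustration only.
records = [
    {"region": "N", "age": 30, "income": 40.0},
    {"region": "N", "age": 40, "income": 50.0},
    {"region": "S", "age": 50, "income": 60.0},
    {"region": "N", "age": 35, "income": None},
]
hot = hot_deck_impute(records, "income", "region", random.Random(0))
reg = regression_impute(records, "income", "age")
```

Note that the hot-deck value is always one of the observed donor values, while the regression value falls on the fitted line.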

Both deck imputation (hot-deck or cold-deck) and regression imputation can lead to underestimation of the variance, because the imputed values are either restricted to values that have already been observed or tend to concentrate at the center of the distribution. The multiple imputation method provides a valuable alternative imputation strategy because it supports statistical inferences that reflect the uncertainty due to imputation. Instead of filling in a single value for each missing response, the multiple imputation method replaces each missing value with a set of plausible values. The multiply imputed data sets are then analyzed together to account for the additional variance introduced by imputation.
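One common way to analyze multiply imputed data sets together is to apply the combining rules usually attributed to Rubin: average the point estimates, then add the average within-imputation variance to an inflated between-imputation variance. The text above does not name a specific combining method, so the sketch below is an assumption about how that step might be carried out, and the function name and inputs are hypothetical.

```python
from statistics import mean, variance


def combine_estimates(estimates, variances):
    """Combine results from m imputed data sets.

    estimates: point estimate from each imputed data set
    variances: estimated sampling variance of each point estimate
    Returns (pooled estimate, total variance).
    """
    m = len(estimates)
    pooled = mean(estimates)
    within = mean(variances)        # average within-imputation variance
    between = variance(estimates)   # between-imputation variance (divisor m - 1)
    total = within + (1 + 1 / m) * between
    return pooled, total


# Made-up results from three imputed data sets, for illustration only.
pooled, total = combine_estimates([10.0, 12.0, 11.0], [1.0, 1.0, 1.0])
```

The between-imputation term is what a single imputation omits: it is the extra variance that reflects uncertainty about the imputed values themselves.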

...
