Self-Selection Bias

Self-selection bias is the problem that often results when survey respondents are allowed to decide entirely for themselves whether to participate in a survey. To the extent that respondents' propensity to participate is correlated with the substantive topic the researchers are trying to study, the resulting data will suffer from self-selection bias. In most instances, self-selection leads to biased data because the respondents who choose to participate do not represent the entire target population well.
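The role of the correlation between participation propensity and the survey topic can be illustrated with a small simulation. The sketch below is only illustrative: the population size, the 40% share holding the opinion, and the response probabilities are all invented for the example, and `simulate` is a hypothetical helper rather than an established procedure.

```python
import random


def simulate(n=100_000, seed=0):
    """Compare respondent means when participation is unrelated vs. related to y."""
    rng = random.Random(seed)
    # Hypothetical population: about 40% hold the opinion of interest (y = 1).
    population = [1 if rng.random() < 0.40 else 0 for _ in range(n)]

    def respondent_mean(p_if_yes, p_if_no):
        # Each person decides independently whether to take part; the chance of
        # taking part may depend on the very opinion being measured.
        responses = [
            y for y in population
            if rng.random() < (p_if_yes if y == 1 else p_if_no)
        ]
        return sum(responses) / len(responses)

    true_share = sum(population) / n
    # Assumed propensities: 10% for everyone (unrelated to the topic).
    unrelated = respondent_mean(0.10, 0.10)
    # Assumed propensities: 30% for opinion holders, 10% for everyone else.
    related = respondent_mean(0.30, 0.10)
    return true_share, unrelated, related


true_share, unrelated, related = simulate()
print(f"true share:                     {true_share:.3f}")
print(f"respondents, unrelated opt-in:  {unrelated:.3f}")
print(f"respondents, related opt-in:    {related:.3f}")
```

Under these assumed rates, the respondent share in the second scenario is expected to be about 0.12 / 0.18, or roughly two thirds, even though only about 40% of the population holds the opinion. When participation is unrelated to the opinion, the respondent share stays close to the population share.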

A key objective of surveys is to measure empirical regularities in a population by sampling a much smaller number of entities that represent the whole target population. Modern sampling theory is predicated on the notion that whether an entity is eligible for interview should be determined by a random mechanism implemented by the researcher. For defined subpopulations formed by a partition of the entire population, that mechanism ensures either that the probability of selection is proportional to the number in the subpopulation or that, after weighting, the weighted sample size is proportional to the number in the subpopulation. Further, the notion that sampling is random rules out selection based on behaviors or attributes about which the researchers are attempting to learn. For example, if researchers seek to learn about political affiliation, the sample will be compromised if the probability of inclusion varies by the respondent's political affiliation. Unfortunately, virtually all survey samples of human beings are self-selected to some degree because of refusal-related nonresponse among the sampled elements. In some cases this contributes only negligible bias, whereas in others the bias is considerable.

The problem with self-selected samples arises when a respondent chooses to do a survey for reasons that are systematically related to the behaviors or attributes under study. The literature on selectivity bias dates back more than 30 years to the work of labor economists. Central to that literature is the finding that the seriousness and intractability of the problem increase when selection into the sample is driven not by exogenous or predetermined variables (under the researcher's control) but by unmeasured effects that also influence the behaviors and other variables the survey researchers want to learn about. In the latter case, the threat to validity is large when the rate of nonresponse is also large. An all-volunteer sample, in which no one is selected by a scientific sampling rule, is the worst case of nonresponse bias. Consequently, threats to validity peak with self-selected samples, a category into which, for example, far too many Internet polls fall. The goal of sampling is to reduce the scope for people to opt into a study based upon the measures under study. Thus, respondents should be chosen for a survey sample by some mechanism that is well understood and statistically independent of the researchers' measurement protocol.

When the respondent chooses the study rather than the study choosing the respondent, the respondent may opt in based upon predetermined, observable characteristics, such as age, race, sex, or region of origin. More dangerously, the respondent may opt in based upon some characteristic that is respondent determined (or at least heavily influenced), such as political ideology, hours worked, religiosity, or other attitudes. When respondents choose a survey for reasons related only to their demographic characteristics, such as age, race, or sex, the damage to randomness often can be “undone” by judicious post-stratification weighting, so long as researchers know the correct universe estimates for these characteristics. However, when omitted variables affect both the propensity to volunteer and the measures under study, the situation becomes difficult, and substantial structure is required to undo the damage of a self-selected sample.
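The contrast between these two situations can be sketched with a minimal post-stratification example. Everything here is assumed for illustration: the two age strata, the known universe shares of 50% each, a true "yes" rate of 60% among the young and 20% among the old (40% overall), and the respondent counts. The helpers `poststratified_mean` and `make_sample` are hypothetical names, not part of any survey library.

```python
def poststratified_mean(sample, population_shares):
    """Weight each respondent by (known population share) / (sample share) of the stratum.

    sample: list of (stratum, y) pairs; population_shares: stratum -> universe share.
    """
    n = len(sample)
    counts = {}
    for stratum, _ in sample:
        counts[stratum] = counts.get(stratum, 0) + 1
    weights = {s: population_shares[s] / (counts[s] / n) for s in counts}
    weighted_total = sum(weights[s] * y for s, y in sample)
    weight_sum = sum(weights[s] for s, _ in sample)
    return weighted_total / weight_sum


def make_sample(yes_counts, sizes):
    """Build a list of (stratum, y) pairs from per-stratum counts."""
    sample = []
    for stratum, size in sizes.items():
        yes = yes_counts[stratum]
        sample += [(stratum, 1)] * yes + [(stratum, 0)] * (size - yes)
    return sample


# Assumed universe estimates and an assumed volunteer sample that
# over-represents older respondents (800 old vs. 200 young).
shares = {"young": 0.5, "old": 0.5}
sizes = {"young": 200, "old": 800}

# Case 1: opting in depends on age only, so within-stratum rates match the
# assumed population rates (60% of young, 20% of old say "yes").
demographic_only = make_sample({"young": 120, "old": 160}, sizes)

# Case 2: within each age group, people who would say "yes" are also more
# likely to volunteer, so within-stratum rates are inflated (80% and 40%).
outcome_driven = make_sample({"young": 160, "old": 320}, sizes)

raw_mean = sum(y for _, y in demographic_only) / len(demographic_only)
fixed = poststratified_mean(demographic_only, shares)
still_biased = poststratified_mean(outcome_driven, shares)

print(f"case 1 unweighted:      {raw_mean:.2f}")
print(f"case 1 post-stratified: {fixed:.2f}")
print(f"case 2 post-stratified: {still_biased:.2f}")
```

In the first case the unweighted estimate is 0.28 and weighting restores the assumed population value of 0.40. In the second case the weighted estimate is 0.60, because reweighting by age cannot correct for an unmeasured factor that drives both volunteering and the answer itself.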

...
