Probability of Selection

Paul J. Lavrakas

doi:10.4135/9781412963947

Edited by:
Paul J. Lavrakas
In: Encyclopedia of Survey Research Methods
Chapter DOI: https://doi.org/10.4135/9781412963947.n404
Subject: Survey Research

In survey sampling, the term probability of selection refers to the chance (i.e., the probability from 0 to 1) that a member (element) of a population can be chosen for a given survey. When a researcher is using a probability sample, the term also means that every member of the sampling frame that is used to represent the population has a known nonzero chance of being selected. That chance can be calculated as a member's probability of being chosen out of all the members in the population. For example, a chance of 1 out of 1,000 is a probability of 0.001 (1/1,000 = 0.001). Since every member in a probability sample has some chance of being selected, the calculated probability is always greater than zero. Because every member has a known chance of being selected, it is possible to compute representative unbiased estimates of whatever a researcher is measuring with the sample. Researchers are able to assume with some degree of confidence that whatever they are estimating represents that same parameter in the larger population from which they drew the sample. For nonprobability samples (such as quota samples, intercept samples, snowball samples, or convenience samples), it is not feasible to confidently assess the reliability of survey estimates, since the selection probability of the sample members is unknown.
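
The arithmetic in the example above can be sketched in a few lines of Python. This is only an illustration, assuming an equal-probability design in which a sample of size n is drawn from a frame of size N; the function name selection_probability is hypothetical and does not come from the entry.

```python
from fractions import Fraction


def selection_probability(sample_size: int, frame_size: int) -> Fraction:
    """Chance of selection for each member under equal-probability sampling (n/N).

    Hypothetical helper for illustration; assumes every frame member
    has the same chance of being chosen.
    """
    if not 0 < sample_size <= frame_size:
        raise ValueError("sample_size must be between 1 and frame_size")
    return Fraction(sample_size, frame_size)


# A chance of 1 out of 1,000, as in the text.
p = selection_probability(1, 1000)
print(float(p))  # 0.001
```

Because the function rejects a sample size of zero, the result is always greater than zero, which mirrors the requirement that every member of a probability sample has a nonzero chance of selection.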

In order to select a sample, researchers generally start with a list of elements, such as addresses or telephone numbers. This defined list is called the “sampling frame.” It is created in advance as a means to select the sample to be used in the survey. The goal in building the sampling frame is to make it as inclusive as possible of the larger (target) population that it is meant to cover. As a practical reality, sampling frames can suffer from some degree of undercoverage and may be plagued with duplication. Undercoverage leads to possible coverage error, whereas duplication leads to unequal probabilities of selection because some elements have more than one chance of being selected. Minimizing and even eliminating duplication may be possible, but undercoverage may not be a solvable problem, in part because of the cost of the potential solution(s).
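
The effect of duplication on selection probabilities can be illustrated with a small sketch. It assumes a single random draw from the frame listing, so an element that is listed twice has two chances of being drawn; the telephone numbers and the function name single_draw_probabilities are made up for the example.

```python
from collections import Counter
from fractions import Fraction


def single_draw_probabilities(frame):
    """Chance that each distinct element is picked in one random draw.

    Hypothetical helper: an element listed k times in a frame of
    length L has probability k/L in a single equal-chance draw.
    """
    listings = Counter(frame)
    frame_length = len(frame)
    return {element: Fraction(k, frame_length) for element, k in listings.items()}


# Illustrative frame in which one number is listed twice.
frame = ["555-0101", "555-0102", "555-0102", "555-0103"]
probabilities = single_draw_probabilities(frame)

for element, p in probabilities.items():
    print(element, p)
```

Here the duplicated number has a probability of 1/2, while the other two numbers each have 1/4. Removing the duplicate listing before sampling would restore equal probabilities.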

In designing a method for sampling, the selection probability does not necessarily have to be the same (i.e., equal) for each element, as it would be in a simple random sample. Some survey designs purposely oversample members from certain subclasses of the population to have enough cases to compute more reliable estimates for those subclasses. In this case, the subclass members have higher selection probabilities by design; however, what is necessary in a probability sample is that the selection probability is knowable.
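
A purposely unequal design of this kind might be sketched as follows. The stratum names and counts are invented for illustration, and the step of taking the inverse of the selection probability as a base weight is a common convention assumed here rather than something stated in the entry.

```python
from fractions import Fraction


def stratum_probabilities(frame_sizes, sample_sizes):
    """Selection probability n_h / N_h within each stratum.

    Hypothetical helper; assumes equal-probability selection
    inside each stratum.
    """
    return {
        stratum: Fraction(sample_sizes[stratum], frame_sizes[stratum])
        for stratum in frame_sizes
    }


# Invented example: a small subclass is oversampled on purpose.
frame_sizes = {"general": 9000, "subclass": 1000}
sample_sizes = {"general": 450, "subclass": 200}

probabilities = stratum_probabilities(frame_sizes, sample_sizes)

# Assumed convention: base weight is the inverse of the selection probability.
base_weights = {stratum: 1 / p for stratum, p in probabilities.items()}

for stratum in frame_sizes:
    print(stratum, probabilities[stratum], base_weights[stratum])
```

In this sketch the subclass members are selected with probability 1/5 and the rest with probability 1/20. The probabilities differ, but both are known, which is what a probability sample requires.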

Depending on the method of data collection, the final selection probability may not be known at the outset of data collection. For example, in household surveys, such as those selected via random-digit dialing (RDD), additional information such as the number of eligible household members needs to be collected at the time of contact in order to accurately compute the final selection probability. The more eligible members in the household, the lower the selection probability of any one member; for example, in a household with a wife, husband, and two adult children, each has a probability of selection within their household of 1/4. Furthermore, in RDD landline telephone surveys of the general public, it is common to ask how many working telephone numbers are associated with a household. If there are two working landline telephone numbers, then the household has twice the chance of being selected compared to households with only one working landline number, and thus a weighting adjustment can be made for households with two or more numbers. Similarly, in mail surveys that do not sample specifically named people, a question about household size regarding eligible members is generally asked. In a systematic sample (e.g., exit polls), the probability of selection is the inverse of the sampling interval.
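
The adjustments described in this paragraph can be combined in a short sketch. It rests on several assumptions that are not in the entry: the chance of drawing any single telephone number is small, so a household with k working landlines has roughly k times that chance; one eligible member is then chosen at random within the household; and the function names are hypothetical.

```python
from fractions import Fraction


def final_selection_probability(number_probability, landlines, eligible_members):
    """Approximate chance that one eligible person ends up selected.

    number_probability: chance that any single telephone number is drawn
        (assumed small, so chances across numbers roughly add).
    landlines: working landline numbers that reach the household.
    eligible_members: eligible people in the household, one of whom
        is assumed to be chosen at random.
    """
    if landlines < 1 or eligible_members < 1:
        raise ValueError("landlines and eligible_members must be at least 1")
    household_probability = number_probability * landlines
    return household_probability / eligible_members


def systematic_probability(sampling_interval):
    """Selection probability in a systematic sample: 1 / interval."""
    if sampling_interval < 1:
        raise ValueError("sampling_interval must be at least 1")
    return Fraction(1, sampling_interval)


# Household from the text: four eligible adults, here with two landlines,
# and an invented per-number probability of 1/1,000.
p = final_selection_probability(Fraction(1, 1000), landlines=2, eligible_members=4)
print(p)

# Selecting every 10th voter in an exit poll.
print(systematic_probability(10))
```

With these invented inputs the person-level probability is 1/2,000, twice what it would be for the same household with a single landline, and a weighting adjustment would scale that household's responses down accordingly.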

...
