
The capture-recapture method is a technique for estimating the size of a population that cannot be directly measured in its entirety. It is derived from ecological research methods. To take a census of a group of animals (e.g., the population of fish in a pond), researchers capture a subset of animals, mark or tag them in some way, release them, then capture another sample (recapture). Some of the animals from the first sample will reappear in the second sample as well; some animals will appear only in one of the two captures. From this information, the size of the whole population can be estimated.

Capture-recapture methods have been adapted by epidemiologists for the surveillance and identification of human illnesses. Routine surveillance methods are likely to fail to identify every affected person, and capture-recapture methods can be used to estimate the size of the affected population more completely. In epidemiology, the ‘capture’ involves identifying affected persons from lists, registers, or other sources of information about diagnosed cases of a given condition. The presence of an individual on one of these sources is analogous to the capture of an animal by an ecologist. The use of two or more overlapping but incomplete lists of cases (multiple ‘captures’) allows for estimation of the number of cases missing from all lists and, from that, estimation of the total population size.

With two different sources, each person will appear on one, both, or neither list. The status of all cases on the lists can be summarized in a 2 × 2 table, where cell a represents presence on both lists, cell b represents presence on List 1 but not on List 2, cell c represents presence on List 2 but not on List 1, and cell d represents absence from both lists.

Table 1 Representation in a 2 × 2 Table of the Status of All Cases Identified From Two Independent Sources
                     On List 2?
                     Yes    No
On List 1?    Yes    a      b
              No     c      d
Note: The number of cases not found on either list (cell d) may then be estimated from this table using the property that, when the two lists are independent, the cross-products (ad and bc) are expected to be equal.

From this, we can use the property that the product of cells a and d will equal the product of cells b and c to solve for the unknown quantity in cell d. Specifically, d = bc/a. After estimating the frequency of cell d, the size of the entire population of interest may be estimated by summing all four cells: a + b + c + d = total estimated population size. However, this method assumes that Lists 1 and 2 are independent, that is, that the presence of a case on List 1 does not influence whether or not a case is on List 2. Using only two sources, this assumption cannot be tested and estimates cannot be adjusted for possible dependency between sources.
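
The two-source calculation can be illustrated with a minimal Python sketch. It is not a reference implementation. The function name two_source_estimate and the case identifiers in the usage lines are hypothetical and chosen only for illustration. The sketch also assumes that each case carries an identifier that can be matched exactly across the two lists.

```python
def two_source_estimate(list1, list2):
    """Estimate total population size from two overlapping case lists.

    Assumes the two lists are independent and that cases can be matched
    exactly by identifier (both are assumptions of this sketch).
    """
    s1, s2 = set(list1), set(list2)
    a = len(s1 & s2)  # cell a: on both lists
    b = len(s1 - s2)  # cell b: on List 1 only
    c = len(s2 - s1)  # cell c: on List 2 only
    if a == 0:
        # With no overlap, d = bc/a is undefined and no estimate is possible.
        raise ValueError("lists do not overlap; cannot estimate cell d")
    d = b * c / a     # cell d: estimated cases missing from both lists
    return {"a": a, "b": b, "c": c, "d": d, "total": a + b + c + d}


# Hypothetical identifiers: 60 cases on List 1, 50 on List 2, 20 on both.
list1_ids = range(1, 61)
list2_ids = range(41, 91)
result = two_source_estimate(list1_ids, list2_ids)
print(result)
```

With these made-up lists, a = 20, b = 40, and c = 30, so d = (40 × 30)/20 = 60 and the estimated total is 20 + 40 + 30 + 60 = 150 cases, of which only 90 appear on at least one list.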

More than two sources may be used as well. With three sources, eight estimates can be produced, accounting for all possible interactions between and dependencies among sources. Methods exist for then selecting the best single estimate from the eight possible estimates. As the number of sources increases beyond three, however, the number of possible estimates quickly becomes unwieldy, though it is possible that some smaller sources could be combined to reduce the overall number of sources used in the estimation.

...
