
Multi-Level Integrated Database Approach (MIDA)

The multi-level integrated database approach (MIDA) is an enhancement to survey sampling that uses databases to collect as much information as practical about the target sample, at both the case level and various aggregate levels, during the initial sampling stage. The goal of MIDA is to raise the final quality, and thus the accuracy, of survey data; it can do this in a variety of ways.

Building an MIDA

The following description of MIDA uses the example of national samples of U.S. households based on addresses and, as such, applies directly to postal and in-person samples. However, similar approaches can be applied to other modes and populations (e.g., national random-digit dialing [RDD] telephone samples, panel studies, list-based samples, and local surveys).

The first step in MIDA is to extract all relevant public information at both the case level and aggregate levels from the sampling frame from which the sample addresses are drawn. In the United States, general population samples of addresses are typically nearly void of household-level information. However, U.S. address samples are rich in aggregate-level information. Address or location, of course, is the one known attribute of all cases, whether respondents or nonrespondents. Moreover, address-based sampling frames are typically based on the U.S. Census and as such the appropriate census data from blocks, tracts, place, and so on are part of the sampling frame and are linked to each address.

The second step is to augment the sampling frame by linking all cases in the sample to other databases. At the case level, that means linking the addresses to such sources as telephone directories, credit records, property records, voter registration lists, and many other public sources. The information obtained includes whether or not a match was found (e.g., whether the address is listed in a telephone directory) and, if matched, whatever particular information is available (e.g., names, telephone numbers, credit reports, voter registration status).
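The case-level linkage described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the addresses, the directory extract, the field names, and the helper `link_case_level` are all hypothetical, and real linkage would need address standardization and fuzzy matching rather than exact string equality. The point it shows is that every sampled case receives a match indicator, whether or not a match is found, plus any fields the source supplies.

```python
# Hypothetical sampled addresses (the only attribute known for every case).
sample = [
    {"case_id": 1, "address": "12 Oak St"},
    {"case_id": 2, "address": "48 Elm Ave"},
]

# Hypothetical telephone-directory extract keyed by address (made-up values).
directory = {
    "12 Oak St": {"name": "J. Smith", "phone": "555-0100"},
}


def link_case_level(cases, source, prefix):
    """Attach a match flag and any matched fields to every sampled case."""
    linked = []
    for case in cases:
        record = dict(case)
        hit = source.get(case["address"])  # exact match only (an assumption)
        # The match outcome itself is data, for matched and unmatched cases.
        record[f"{prefix}_matched"] = hit is not None
        for field, value in (hit or {}).items():
            record[f"{prefix}_{field}"] = value
        linked.append(record)
    return linked


linked = link_case_level(sample, directory, "dir")
```

Keeping the unmatched case in the output with `dir_matched` set to `False` mirrors the text: the absence of a listing is recorded as information rather than dropped.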

At the aggregate level, this means merging information from sources other than those in the sampling frame. Examples of aggregate-level data beyond that from the census that could be appended are consumer information from such sources as Claritas's PRIZM NE and Donnelley Marketing's FIND Index, voting information from national elections, and data on such other matters as vital statistics, crime rates, religion, public housing, HIV/STD rates, and public welfare utilization.

The linked data include information from multiple levels of aggregation. The multi-level analysis starts with household-based data and includes neighborhood-level data from census tract and ZIP code-based data sources; community-level data from the census, election counts, crime rates, and other sources; and higher-level aggregations (e.g., metropolitan areas and census divisions).
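As a rough sketch of how aggregate data at several levels might be appended to each case, the Python below joins each sampled case to one lookup table per geographic level. The geographic identifiers, table contents, variable names, and the helper `append_aggregates` are hypothetical and the values are made up for illustration; nothing here is taken from an actual MIDA implementation.

```python
# Hypothetical cases, each carrying identifiers for several geographic levels.
cases = [
    {"case_id": 1, "tract": "T001", "county": "C01"},
    {"case_id": 2, "tract": "T002", "county": "C01"},
]

# One lookup table per level of aggregation (illustrative, made-up values).
levels = {
    "tract": {
        "T001": {"median_income": 52000},
        "T002": {"median_income": 38000},
    },
    "county": {
        "C01": {"crime_rate": 4.2},
    },
}


def append_aggregates(cases, levels):
    """Copy each level's variables onto every case, prefixed by the level."""
    out = []
    for case in cases:
        record = dict(case)
        for level, table in levels.items():
            # Cases with no identifier or no entry at this level are left as-is.
            for field, value in table.get(case.get(level), {}).items():
                record[f"{level}_{field}"] = value
        out.append(record)
    return out


multilevel = append_aggregates(cases, levels)
```

Prefixing each appended variable with its level keeps the level of aggregation visible in the merged record, which matters when the same measure is available at more than one level.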

The third step is to use information gained from the initial case-level linkages to secure additional information. For example, securing a name and telephone number from a telephone directory search can lead to households being found in databases for which an address alone was insufficient to produce a match. Also, once a respondent is identified, person-level linkage can be carried out in addition to household-level matching. Thus, the process of augmenting the sampling frame is iterative and continues during the data collection phase.
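The iterative character of this step can be illustrated with a small Python sketch. It assumes, purely for illustration, that each source is keyed by a set of fields and can only be searched once the case record holds all of them; the sources, their contents, and the helper `iterative_link` are hypothetical. The loop repeats until a full pass adds nothing new.

```python
def iterative_link(case, sources, max_rounds=5):
    """Re-query all sources until a full pass adds no new information."""
    record = dict(case)
    for _ in range(max_rounds):
        added = False
        for key_fields, table in sources:
            # A source is usable only when the record holds all its key fields.
            if not all(f in record for f in key_fields):
                continue
            key = tuple(record[f] for f in key_fields)
            for field, value in table.get(key, {}).items():
                if field not in record:
                    record[field] = value
                    added = True
        if not added:
            break
    return record


# Hypothetical sources with made-up values. The second needs only an address;
# the first needs a name and telephone number that the second can supply.
sources = [
    (("name", "phone"), {("J. Smith", "555-0100"): {"registered_voter": True}}),
    (("address",), {("12 Oak St",): {"name": "J. Smith", "phone": "555-0100"}}),
]

result = iterative_link({"case_id": 1, "address": "12 Oak St"}, sources)
```

In this example the name-and-phone source cannot be used on the first pass; it yields a match only on the second pass, after the address-based source has supplied the name and telephone number.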

...
