While many health-related outcomes are measured on a continuous scale (age, blood pressure, virus particles per milliliter of blood), many common types of epidemiologic data analysis are performed using categorical data. Categorical data are data that exist in discrete groupings. Some categories are predefined, such as gender (male or female), while others are defined by cutpoints in the data, such as hypertensive (for example, blood pressure greater than or equal to 140/90 mm Hg) or normotensive (blood pressure below that cutpoint). This type of classification yields only two categories, as opposed to the full range of possible blood pressure measurements. Because these types of data are so common, a specialized set of statistical techniques has been developed for them. This entry documents the key approaches for categorical data analysis used in epidemiology.
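As a minimal sketch of how a cutpoint turns a continuous measurement into a two-category variable, the Python snippet below labels blood pressure readings and tallies the resulting groups. The function name `classify_bp`, the readings, and the cutpoint values are all assumptions made for illustration, as is the rule that a reading counts as hypertensive when either the systolic or the diastolic value reaches its cutpoint. None of these come from the entry itself.

```python
from collections import Counter


def classify_bp(systolic, diastolic, sys_cut, dia_cut):
    """Dichotomize one reading: 'hypertensive' if either value is at or
    above its cutpoint, otherwise 'normotensive' (assumed rule)."""
    if systolic >= sys_cut or diastolic >= dia_cut:
        return "hypertensive"
    return "normotensive"


# Hypothetical (systolic, diastolic) readings in mm Hg, invented for illustration.
readings = [(118, 76), (142, 88), (131, 92), (110, 70)]

# Illustrative cutpoints passed in explicitly; substitute whatever
# definition a given study actually uses.
SYS_CUT, DIA_CUT = 140, 90

labels = [classify_bp(s, d, SYS_CUT, DIA_CUT) for s, d in readings]
counts = Counter(labels)  # number of readings in each category
print(counts)
```

Passing the cutpoints as arguments rather than hard-coding them keeps the categorization rule visible, since the choice of cutpoint determines who lands in each group.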

Simple Analysis

The 2 × 2 (‘two-by-two’) table is frequently analyzed in very basic epidemiologic studies. This approach is suitable for a study with two levels of exposure (e.g., exposed, not exposed), two categories of outcome (e.g., disease, no disease), and no other factors that might influence the association between exposure and outcome (i.e., no confounding or effect modification). Table 1 displays the common layout of a 2 × 2 table for an epidemiologic study.

The type of study design will determine the methods used to compute measures of interest, such as the frequency of the outcome for each exposure group, or an estimate of the association between exposure and outcome. This entry will first focus on the simple 2 × 2 table, and then cover stratified analysis of 2 × 2 tables to assess the influence of confounding and effect modification.

Simple (Crude) Analysis of Cohort Studies and Randomized Trials

In cohort studies, the investigator wants to determine whether an exposure causes new cases of disease (or another outcome). While no single cohort study can definitively prove such a relationship, determining causation is the underlying goal of the research. Cohort studies provide at least two pieces of information that are useful in understanding causal relationships: (1) whether an association between exposure and outcome exists (including the strength of that association) and (2) whether the exposure preceded the outcome (temporal association). At the beginning of the study period, individuals are classified as exposed or not exposed, and no participants have the outcome of interest. Participants are then followed forward in time to determine who develops the outcome.

Table 1 Classic 2 × 2 Table for Epidemiologic Study Data Analysis
               Outcome    No Outcome    Total
Exposed        a          b             a + b
Not exposed    c          d             c + d
Total          a + c      b + d         T = a + b + c + d

From a study design perspective, prospective cohort studies and randomized trials are very similar, primarily differing in how exposure status is assigned. In randomized trials, exposure is assigned by the researchers based on a random assignment protocol, while in cohort studies, exposure is not under the control of the experimenter. The same analytic methods are appropriate for both cohort studies and randomized trials.
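Using the cell labels from Table 1, the frequency of the outcome in each exposure group and two common measures of association can be sketched in Python as follows. This is only an illustration under stated assumptions: it assumes equal follow-up time for all participants, the class name `TwoByTwo` and its method names are hypothetical, and the counts are invented rather than taken from any study in this entry.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Cell counts laid out as in Table 1."""
    a: int  # exposed, outcome
    b: int  # exposed, no outcome
    c: int  # not exposed, outcome
    d: int  # not exposed, no outcome

    def risk_exposed(self) -> float:
        # Proportion of exposed participants who developed the outcome.
        return self.a / (self.a + self.b)

    def risk_unexposed(self) -> float:
        # Proportion of unexposed participants who developed the outcome.
        return self.c / (self.c + self.d)

    def risk_ratio(self) -> float:
        # Relative measure of association: risk in exposed / risk in unexposed.
        return self.risk_exposed() / self.risk_unexposed()

    def risk_difference(self) -> float:
        # Absolute measure of association: risk in exposed - risk in unexposed.
        return self.risk_exposed() - self.risk_unexposed()


# Hypothetical counts, invented for illustration only.
table = TwoByTwo(a=20, b=80, c=10, d=90)
print(table.risk_exposed(), table.risk_unexposed())
print(table.risk_ratio(), table.risk_difference())
```

A risk ratio above 1 (or a risk difference above 0) indicates that the outcome was more frequent among the exposed. The sketch does not guard against empty rows or a zero risk in the unexposed group, which would need handling in real use.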

A Simple Example with Equal Follow-Up Time

A classic example of a retrospective cohort study is a food poisoning outbreak at a wedding reception. The day after attending an outdoor reception catered by a local company, 65 of 130 guests had symptoms of gastroenteritis (i.e., vomiting, diarrhea, stomach cramps) that resulted in an emergency department visit for medical care. The epidemiologist interviewed all 130 individuals who attended the reception and determined what they ate, if they had symptoms of gastroenteritis, and when those symptoms began. All individuals were asked about symptoms on the wedding day and for the week after the wedding. The investigation identified another 7 individuals with symptoms meeting the case definition for the disease, bringing the total to 72 cases. Of these, 4 people had these symptoms the morning of the wedding. Because of the warm afternoon sun at the reception, consumption of potato salad was suspected as the source of infection (the exposure). Table 2 was designed for the data

...
