Relationships observed for groups do not necessarily hold for individuals, and vice versa. The ecological fallacy is an error of inference that may arise in ecological studies when an investigator draws a conclusion about individuals from aggregate data for a group. In ecological studies, the relation between exposure rates and event rates is assessed at the group level because only the marginal distributions of exposure (risk factor) and outcome event are known, not their joint distribution. Researchers have made unwarranted inferences from an association between exposure and outcome across groups (ecological data) to an association among the individuals within each group, without accounting for possible ecological bias. Aggregating data loses information. The ecological fallacy may arise because aggregation conceals individual-level variation that is not visible at the aggregate level (see the explanation in Example 2 below). Statistically, a correlation tends to be larger in magnitude when an association is assessed at the group level than when it is assessed at the individual level.

Example 1: Nativity and Literacy

Robinson calculated the correlation coefficient between nativity (the percentage of the population who are foreign-born) and literacy (the percentage of the population who are literate) across the 48 states of the United States in 1930 to be .53, a positive correlation. This is an ecological correlation because the unit of observation and analysis is the state. When computed at the individual level, however, the correlation coefficient turns out to be −.11, a negative coefficient. The discrepancy arises because the foreign-born tended to live in states where the native-born were more literate; inferring from the state-level correlation that foreign-born individuals were more likely to be literate would be an ecological fallacy.
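The sign reversal in this example can be reproduced with synthetic data. The following is a minimal sketch, not Robinson's data or method: the 48 states of 1,000 people each, the foreign-born shares, and the literacy rates are all invented assumptions, chosen only so that the foreign-born are less literate than the native-born in every state while living disproportionately in high-literacy states.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)

N_STATES = 48            # hypothetical states
PEOPLE_PER_STATE = 1000  # hypothetical population per state

state_pct_foreign, state_pct_literate = [], []
person_foreign, person_literate = [], []

for s in range(N_STATES):
    # Assumption: the foreign-born share rises from 2% to 25% across states.
    share_foreign = 0.02 + 0.23 * s / (N_STATES - 1)
    # Assumption: native-born literacy is higher where more foreign-born live,
    # and the foreign-born are 10 points less literate in every state.
    native_rate = 0.80 + 0.7 * share_foreign
    foreign_rate = native_rate - 0.10

    n_foreign = round(PEOPLE_PER_STATE * share_foreign)
    n_native = PEOPLE_PER_STATE - n_foreign
    lit_foreign = round(n_foreign * foreign_rate)
    lit_native = round(n_native * native_rate)

    # Individual-level records: one 0/1 pair per person.
    person_foreign += [1] * n_foreign + [0] * n_native
    person_literate += ([1] * lit_foreign + [0] * (n_foreign - lit_foreign)
                        + [1] * lit_native + [0] * (n_native - lit_native))

    # Group-level records: one pair of percentages per state.
    state_pct_foreign.append(100 * n_foreign / PEOPLE_PER_STATE)
    state_pct_literate.append(100 * (lit_foreign + lit_native) / PEOPLE_PER_STATE)

ecological_r = pearson(state_pct_foreign, state_pct_literate)
individual_r = pearson(person_foreign, person_literate)
print(f"state-level r = {ecological_r:.2f}, individual-level r = {individual_r:.2f}")
```

Under these assumptions the state-level correlation is positive and the individual-level correlation is negative, which is the same pattern as in the example; the magnitudes are artifacts of the invented numbers and should not be compared with Robinson's.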

Example 2: Blood Pressure and Stroke Mortality

In the Seven Countries Study, Menotti et al. (1997) found that mean entry-level blood pressures and stroke mortality rates were highly inversely correlated across 16 cohorts of men aged 45 to 59 with 25-year follow-up. This was contrary to expectation. When the analyses were repeated at the individual level within cohorts, the association between blood pressure and stroke mortality was found to be strongly positive in most of the cohorts, and hence the correlation for all individuals should have been positive. The explanation of this paradox is that, within each cohort, individuals who died from stroke tended to have had high blood pressure; but when the individual values in each cohort were averaged and the 16 pairs of average values were used to calculate the correlation, the cohorts with higher average blood pressures may have turned out to have lower mortality rates simply because of the heterogeneity of correlations among the cohorts.
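A small simulation can show how a positive association within every cohort can coexist with a negative association between cohort averages. This is a sketch under invented assumptions, not a reanalysis of the Seven Countries Study: the cohort sizes, blood pressure ranges, and death probabilities are hypothetical, and the mechanism used here (cohorts with higher mean blood pressure are given a lower baseline risk) is only one simple way to produce the pattern.

```python
import random
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)

rng = random.Random(0)   # fixed seed so the run is repeatable
N_COHORTS = 16
MEN_PER_COHORT = 3000    # hypothetical cohort size

cohort_mean_bp, cohort_death_rate, within_r = [], [], []

for c in range(N_COHORTS):
    center = 130 + 2 * c          # assumed cohort mean blood pressure (mmHg)
    baseline = 0.25 - 0.008 * c   # assumed baseline risk, lower in high-BP cohorts
    bps, deaths = [], []
    for _ in range(MEN_PER_COHORT):
        bp = center + 30 * (rng.random() - 0.5)       # individual BP, center +/- 15
        p_death = baseline + 0.006 * (bp - center)    # risk rises with own BP
        bps.append(bp)
        deaths.append(1 if rng.random() < p_death else 0)
    cohort_mean_bp.append(sum(bps) / MEN_PER_COHORT)
    cohort_death_rate.append(sum(deaths) / MEN_PER_COHORT)
    within_r.append(pearson(bps, deaths))             # individual level, one cohort

# Group level: 16 pairs of (mean blood pressure, mortality rate).
between_r = pearson(cohort_mean_bp, cohort_death_rate)
n_positive = sum(r > 0 for r in within_r)
print(f"between-cohort r = {between_r:.2f}; "
      f"{n_positive} of {N_COHORTS} within-cohort correlations are positive")
```

Averaging discards the within-cohort relation between a man's own blood pressure and his risk, so the 16 averaged pairs reflect only how the cohorts differ from one another.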

Example 3: Breast Cancer and Fat Consumption

Carroll found that death rates from breast cancer were significantly higher in countries where fat consumption was high than in those where it was low. This is an association in aggregate data, since the unit of observation is the country. An inference to the individual level (that because countries with more fat in the diet have higher rates of breast cancer, women who eat fatty foods must be more likely to get breast cancer) may commit an ecological fallacy, because one cannot be certain that the women who developed breast cancer had high fat intakes. In fact, the problem of the ecological fallacy in the link between breast cancer and fat intake was raised by Holmes et al. (1999) when they examined individual-level data.

...
