Simpson’s paradox, first defined by Edward H. Simpson in 1951, is a statistical phenomenon in which the association between two variables reverses or disappears depending on whether a population’s data are examined in aggregate or disaggregated by a third variable. It is also known as the Yule effect, the reversal paradox, or the amalgamation paradox.
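
The reversal can be made concrete with a small numerical sketch. The counts below are illustrative and are not taken from this entry; the treatment labels `A` and `B`, the strata `small` and `large`, and the helper names are assumptions chosen only for the example. Within each stratum of the third variable, treatment A has the higher success rate, yet after pooling the strata treatment B appears better.

```python
from fractions import Fraction

# Illustrative counts (an assumption, not data from the entry):
# (successes, trials) for each treatment within each stratum of a third variable.
counts = {
    "small": {"A": (81, 87), "B": (234, 270)},
    "large": {"A": (192, 263), "B": (55, 80)},
}


def stratum_rates(counts):
    """Success rate of each treatment within each stratum (disaggregated view)."""
    return {
        stratum: {t: Fraction(s, n) for t, (s, n) in by_treatment.items()}
        for stratum, by_treatment in counts.items()
    }


def aggregate_rates(counts):
    """Success rate of each treatment after pooling all strata (aggregate view)."""
    totals = {}
    for by_treatment in counts.values():
        for t, (s, n) in by_treatment.items():
            prev_s, prev_n = totals.get(t, (0, 0))
            totals[t] = (prev_s + s, prev_n + n)
    return {t: Fraction(s, n) for t, (s, n) in totals.items()}


within = stratum_rates(counts)
overall = aggregate_rates(counts)

# Print both views so the change in direction is visible side by side.
for stratum, rates in within.items():
    print(stratum, {t: round(float(r), 3) for t, r in rates.items()})
print("overall", {t: round(float(r), 3) for t, r in overall.items()})
```

The direction flips because the two treatments are distributed unevenly across the strata: in these counts, A is applied mostly to the stratum with the lower success rates, which drags down its pooled rate.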

For decision making, Simpson’s paradox raises the practical question of which level of data aggregation yields the results of interest. Answering it requires identifying candidate third variables and then establishing a criterion for deciding whether, and which of, those variables should influence the decision.

Figure 1 Simpson’s paradox illustration for categorical cause and outcome variables

Simpson’s paradox is commonly defined for a categorical cause variable (C) and a ...
