Simpson's Paradox - Description

Description

Suppose two people, Lisa and Bart, each edit articles for two weeks. In the first week, Lisa improves 0 of the 3 articles she edits, and Bart improves 1 of the 7 articles he edits. In the second week, Lisa improves 5 of the 7 articles she edits, while Bart improves all 3 of the articles he edits.

        Week 1   Week 2   Total
Lisa    0/3      5/7      5/10
Bart    1/7      3/3      4/10

In both weeks Bart improved a higher percentage of articles than Lisa, but the number of articles each edited (the bottom number of each ratio, also known as the sample size) was not the same for the two of them in either week. When the totals for the two weeks are added together, Bart's and Lisa's work can be judged on an equal sample size, i.e. the same number of articles edited by each. On that basis Lisa's ratio is higher and, therefore, so is her percentage. Equivalently, when the two weekly rates are combined as a weighted average, with each week weighted by the number of articles edited in it, Lisa comes out ahead of Bart: most of her articles fall in the second week, when both editors' rates were high, while most of Bart's fall in the first week, when both were low. Like other paradoxes, it only appears to be a paradox because of incorrect assumptions, incomplete or misleading information, or a misunderstanding of a particular concept.

        Week 1 rate   Week 2 rate   Total (weighted) rate
Lisa    0%            71.4%         50%
Bart    14.3%         100%          40%

The apparent paradox arises when the percentages are reported without the underlying ratios. In this example, if only Bart's 14.3% for the first week were reported, and not the ratio (1 of 7), the information would be distorted, giving rise to the apparent paradox. Even though Bart's percentage is higher in both the first and the second week, when the two weeks of articles are combined, Lisa improved the greater proportion overall: 50% of her 10 articles, against 40% of Bart's 10.
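The figures above can be reproduced with a few lines of Python. This is only a minimal sketch: the names `results`, `weekly_rates` and `combined_rate` are made up for illustration, and the counts are the ones given in the example.

```python
from fractions import Fraction

# (improved, edited) for week 1 and week 2, taken from the example
results = {
    "Lisa": [(0, 3), (5, 7)],
    "Bart": [(1, 7), (3, 3)],
}

def weekly_rates(weeks):
    """Success rate for each week, kept as an exact fraction."""
    return [Fraction(improved, edited) for improved, edited in weeks]

def combined_rate(weeks):
    """Add up the raw counts first, then divide once."""
    improved = sum(i for i, _ in weeks)
    edited = sum(e for _, e in weeks)
    return Fraction(improved, edited)

for name, weeks in results.items():
    per_week = ", ".join(f"{float(r):.1%}" for r in weekly_rates(weeks))
    total = float(combined_rate(weeks))
    print(f"{name}: weekly {per_week}; combined {total:.0%}")
```

Keeping the counts as ratios until the last step is the point of the sketch: the combined rate is computed from 5/10 and 4/10, not by averaging the weekly percentages.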

To summarize:

  • In the first week
      - Lisa improved 0% of the articles she edited.
      - Bart improved 14.3% of the articles he edited.
      - Success is associated with Bart.
  • In the second week
      - Lisa improved 71.4% of the articles she edited.
      - Bart improved 100% of the articles he edited.
      - Success is associated with Bart.

On both occasions Bart's edits were more successful than Lisa's. But if we combine the two sets, we see that Lisa and Bart both edited 10 articles, and:

  • Lisa improved 5 articles.
  • Bart improved only 4.
  • Success is now associated with Lisa.

Bart is better for each set but worse overall.
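The pattern "better in each set, worse overall" can be tested mechanically. The sketch below is an illustration only; `shows_reversal` is a hypothetical helper, and the second data set (equal workloads in both weeks) is invented to show that the reversal cannot occur when both editors handle the same number of articles in each set.

```python
from fractions import Fraction

def pooled_rate(groups):
    """Overall rate from a list of (improved, edited) pairs."""
    return Fraction(sum(i for i, _ in groups), sum(e for _, e in groups))

def shows_reversal(a, b):
    """True if `a` has the lower rate in every set but the higher pooled rate."""
    lower_in_every_set = all(
        Fraction(*x) < Fraction(*y) for x, y in zip(a, b)
    )
    return lower_in_every_set and pooled_rate(a) > pooled_rate(b)

# Data from the example: unequal numbers of articles per week
lisa = [(0, 3), (5, 7)]
bart = [(1, 7), (3, 3)]
print(shows_reversal(lisa, bart))

# Invented data: both edit 5 articles each week, so no reversal is possible
lisa_equal = [(0, 5), (3, 5)]
bart_equal = [(1, 5), (5, 5)]
print(shows_reversal(lisa_equal, bart_equal))
```

The first call reports a reversal and the second does not, which matches the observation that the effect depends on the two editors having different sample sizes in each week.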

The paradox stems from the intuition that Bart could not possibly be a better editor on each set but worse overall. Pearl proved how this is possible, when "better editor" is taken in the counterfactual sense: "Were Bart to edit all items in a set he would do better than Lisa would, on those same items". Clearly, frequency data cannot support this sense of "better editor," because it does not tell us how Bart would perform on items edited by Lisa, and vice versa. In the back of our mind, though, we assume that the articles were assigned at random to Bart and Lisa, an assumption which (for a large sample) would support the counterfactual interpretation of "better editor." However, under random assignment conditions, the data given in this example are unlikely, which accounts for our surprise when confronting the rate reversal.

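The arithmetical basis of the paradox is uncontroversial. Write $S_L(i)$ and $S_B(i)$ for Lisa's and Bart's success rates in week $i$, and $S_L$ and $S_B$ for their overall rates. If $S_B(1) > S_L(1)$ and $S_B(2) > S_L(2)$, we feel that $S_B$ must be greater than $S_L$. However, if different weights are used to form the overall score for each person, then this feeling may be disappointed. Here the first week is weighted $\frac{3}{10}$ for Lisa and $\frac{7}{10}$ for Bart, while the weights are reversed for the second week:

$S_L = \frac{3}{10} S_L(1) + \frac{7}{10} S_L(2), \qquad S_B = \frac{7}{10} S_B(1) + \frac{3}{10} S_B(2).$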
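The role of the weights can be checked numerically. In this sketch the weights are each editor's share of articles per week (3/10 and 7/10, as in the example); the function name `weighted_average` and the variable names are assumptions made for illustration.

```python
from fractions import Fraction

# Weekly success rates from the example
lisa_rates = [Fraction(0, 3), Fraction(5, 7)]
bart_rates = [Fraction(1, 7), Fraction(3, 3)]

# Share of each editor's 10 articles that fell in week 1 and week 2
lisa_weights = [Fraction(3, 10), Fraction(7, 10)]
bart_weights = [Fraction(7, 10), Fraction(3, 10)]

def weighted_average(weights, rates):
    """Overall score as a weighted sum of the weekly rates."""
    return sum(w * r for w, r in zip(weights, rates))

# Each editor scored with his or her own weights: Lisa ends up ahead
s_lisa = weighted_average(lisa_weights, lisa_rates)
s_bart = weighted_average(bart_weights, bart_rates)
print(s_lisa, s_bart)

# Both editors scored with the same weights: Bart stays ahead
s_bart_common = weighted_average(lisa_weights, bart_rates)
print(s_lisa, s_bart_common)
```

With separate weights the overall scores are 1/2 for Lisa and 2/5 for Bart. With a common set of weights Bart's weekly advantage carries over to the total, which is why the reversal needs the weights to differ between the two editors.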

Lisa is a better editor on average, as her overall success rate is higher. But it would have been possible to tell the story in a way that made it appear obvious that Bart is more diligent.

Simpson's paradox shows us an extreme example of the importance of including data about possible confounding variables when attempting to calculate causal relations. Precise criteria for selecting a set of "confounding variables" (i.e., variables that yield correct causal relationships if included in the analysis) are given in Pearl using causal graphs.

While Simpson's paradox often refers to the analysis of count tables, as shown in this example, it also occurs with continuous data: for example, if one fits separate regression lines through two sets of data, the two regression lines may show a positive trend, while a regression line fitted through all the data together will show a negative trend.
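The continuous case can be illustrated with a small least-squares calculation. The data points below are invented purely for illustration, and `slope` is a hypothetical helper that computes the ordinary least-squares slope directly rather than relying on any library.

```python
def slope(xs, ys):
    """Ordinary least-squares slope of y on x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var

# Two made-up groups, each with a clear upward trend
group_a = ([1, 2, 3, 4], [6, 7, 8, 9])
group_b = ([6, 7, 8, 9], [1, 2, 3, 4])

# All points together, ignoring which group they came from
pooled = (group_a[0] + group_b[0], group_a[1] + group_b[1])

print(slope(*group_a))  # positive
print(slope(*group_b))  # positive
print(slope(*pooled))   # negative
```

Each group rises with x, but group B sits lower and further to the right than group A, so a single line through all eight points slopes downward.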
