| [[Image:Simpson's paradox continuous.svg|thumb|right|Simpson's paradox for continuous data: a positive trend appears for two separate groups (blue and red), a negative trend (black, dashed) appears when the data are combined.]]
| <!--[[File:Public Domain Simpson's Paradox.gif|thumb|This clickable gif image shows an explicative example of Simpson's Paradox. Though the percentage of male students who obtained the scholarship for maths is higher than the percentage of female students who obtained that scholarship, and the percentage of male students who obtained the scholarship for physics is higher than the percentage of female students who obtained that scholarship, the percentage of male students who obtained a scholarship (for maths or for physics) is lower than the percentage of female students who obtained a scholarship.]]-->
| |
| | |
| In [[probability]] and [[statistics]], '''Simpson's paradox''', or the '''Yule–Simpson effect''', is a [[paradox]] in which a trend that appears in several groups of data disappears or reverses when these groups are combined. This result is often encountered in social-science and medical-science statistics,<ref>{{cite journal
| |
| | title = Simpson's Paradox in Real Life
| |
| | author = Clifford H. Wagner
| |
| |date=February 1982
| |
| | journal = [[The American Statistician]]
| |
| | volume = 36
| |
| | issue = 1
| |
| | pages = 46–48
| |
| | doi = 10.2307/2684093
| |
| | jstor = 2684093
| |
| }}</ref> and is particularly confounding when frequency data are unduly given causal interpretations.<ref name="pearl">[[Judea Pearl]]. ''Causality: Models, Reasoning, and Inference'', Cambridge University Press (2000, 2nd edition 2009). ISBN 0-521-77362-8.</ref> The paradox disappears when causal relations are brought into consideration. Many statisticians believe that the general public should be informed of counter-intuitive results in statistics such as Simpson's paradox.<ref>Robert L. Wardrop (February 1995). "Simpson's Paradox and the Hot Hand in Basketball". ''The American Statistician'', '''49''' (1): pp. 24–28.</ref><ref>Alan Agresti (2002). ''Categorical Data Analysis'' (2nd edition). [[John Wiley and Sons]]. ISBN 0-471-36093-7.</ref>
| |
| | |
| [[Edward H. Simpson]] first described this phenomenon in a technical paper in 1951,<ref>{{cite journal
| |
| |title=The Interpretation of Interaction in Contingency Tables
| |
| | author = Simpson, Edward H.
| |
| | year = 1951
| |
| | journal = Journal of the Royal Statistical Society, Series B
| |
| | volume = 13
| |
| | pages = 238–241
| |
| }}</ref>
| |
| but the statisticians [[Karl Pearson]], et al., in 1899,<ref>
| |
| {{Cite journal
| |
| | last1 = Pearson | first1 = Karl
| |
| | author1-link = Karl Pearson
| |
| | last2 = Lee | first2 = A.
| |
| | last3 = Bramley-Moore | first3 = L.
| |
| | title = Genetic (reproductive) selection: Inheritance of fertility in man
| |
| | journal = [[Philosophical Transactions of the Royal Society A]]
| |
| | volume = 173
| |
| | pages = 534–539
| |
| | year = 1899
| |
| }}</ref>
| |
| and [[Udny Yule]], in 1903, had mentioned similar effects earlier.<ref>{{Cite journal
| |
| | title = Notes on the Theory of Association of Attributes in Statistics
| |
| | author = G. U. Yule
| |
| | year = 1903
| |
| | journal = [[Biometrika]]
| |
| | volume = 2
| |
| | pages = 121–134
| |
| | doi = 10.1093/biomet/2.2.121
| |
| | issue = 2
| |
| }}</ref>
| |
| The name ''Simpson's paradox'' was introduced by [[Colin R. Blyth]] in 1972.<ref name="blyth-72">{{cite journal
| |
| | title = On Simpson's Paradox and the Sure-Thing Principle
| |
| | author = Colin R. Blyth
| |
| |date=June 1972
| |
| | journal = Journal of the American Statistical Association
| |
| | volume = 67
| |
| | issue = 338
| |
| | pages = 364–366
| |
| | doi = 10.2307/2284382
| |
| | jstor = 2284382
| |
| }}</ref>
| |
| Because Edward Simpson was not the first to describe this statistical paradox (an instance of [[Stigler's law of eponymy]]), some writers use the impersonal names ''reversal paradox'' and ''amalgamation paradox'' instead of ''Simpson's paradox'' or the ''Yule–Simpson effect''.<ref>{{cite journal
| |
| | title = The Amalgamation and Geometry of Two-by-Two Contingency Tables
| |
| | author = [[I. J. Good]], Y. Mittal
| |
| | journal = [[The Annals of Statistics]]
| |
| |date=June 1987
| |
| | volume = 15
| |
| | issue = 2
| |
| | pages = 694–711
| |
| | doi = 10.1214/aos/1176350369
| |
| | issn = 0090-5364
| |
| | jstor=2241334
| |
| }}</ref>
| |
| | |
| ==Examples==
| |
| ===Kidney stone treatment===
| |
| This is a real-life example from a medical study<ref>{{Cite journal
| |
| | author = C. R. Charig, D. R. Webb, S. R. Payne, J. E. Wickham
| |
| | title = Comparison of treatment of renal calculi by open surgery, percutaneous nephrolithotomy, and extracorporeal shockwave lithotripsy
| |
| | journal = [[Br Med J (Clin Res Ed)]]
| |
| | volume = 292
| |
| | issue = 6524
| |
| | pages = 879–882
| |
| | pmid = 3083922
| |
| | date = 29 March 1986
| |
| | doi = 10.1136/bmj.292.6524.879
| |
| | pmc = 1339981
| |
| }}</ref> comparing the success rates of two treatments for [[kidney stone]]s.<ref>{{Cite journal
| |
| | author = Steven A. Julious and Mark A. Mullee
| |
| | title = Confounding and Simpson's paradox
| |
| | journal = [[BMJ]]
| |
| | pages = 1480–1481
| |
| | url = http://bmj.bmjjournals.com/cgi/content/full/309/6967/1480
| |
| | pmid = 7804052
| |
| | date = 3 December 1994 | volume = 309
| |
| | issue = 6967
| |
| | pmc = 2541623
| |
| }}</ref>
| |
| | |
| The table below shows the success rates and numbers of treatments for small and large kidney stones, where Treatment A includes all open surgical procedures and Treatment B is [[percutaneous nephrolithotomy]]. The numbers in parentheses give the number of successes over the total size of the group (for example, 93% equals 81 divided by 87).
| |
| | |
| {| class="wikitable" summary="results accounting for stone size" style="margin-left:auto; margin-right:auto;"
| |
| !
| |
| ! Treatment A
| |
| ! Treatment B
| |
| |- align="center"
| |
| ! Small Stones
| |
| | ''Group 1''<br/>'''93% (81/87)''' || ''Group 2''<br/>87% (234/270)
| |
| |- align="center"
| |
| ! Large Stones
| |
| | ''Group 3''<br/>'''73% (192/263)''' || ''Group 4''<br/>69% (55/80)
| |
| |- align="center"
| |
| ! Both
| |
| | 78% (273/350) || '''83% (289/350)'''
| |
| |}
| |
| | |
| The paradoxical conclusion is that treatment A is more effective when used on small stones, and also when used on large stones, yet treatment B appears more effective when both sizes are considered together. In this example the "lurking" variable (or '''[[confounding|confounding variable]]''') is the stone size, whose importance was not recognized until its effects were included in the analysis.
| |
| | |
| Which treatment is considered better is determined by an inequality between two ratios (successes/total). The reversal of the inequality between the ratios, which creates Simpson's paradox, happens because two effects occur together:
| |
| # The sizes of the groups, which are combined when the lurking variable is ignored, are very different. Doctors tend to give the severe cases (large stones) the better treatment (A), and the milder cases (small stones) the inferior treatment (B). Therefore, the totals are dominated by groups 3 and 2, and not by the two much smaller groups 1 and 4.
| |
| # The lurking variable has a large effect on the ratios, i.e. the success rate is more strongly influenced by the severity of the case than by the choice of treatment. Therefore, the group of patients with large stones using treatment A (group 3) does worse than the group with small stones, even if the latter used the inferior treatment B (group 2).
| |
| Based on these effects, the paradoxical result can be rephrased more intuitively as follows: Treatment A, when applied to a patient population consisting mainly of patients with large stones, is less successful than Treatment B applied to a patient population consisting mainly of patients with small stones.
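The reversal described above can be checked directly from the counts in the table. The following is a minimal sketch, not part of the original study; the names <code>data</code>, <code>by_size</code> and <code>pooled</code> are chosen here for illustration only.

```python
from fractions import Fraction

# (successes, patients) for each treatment and stone size, copied from the table above.
data = {
    "A": {"small": (81, 87), "large": (192, 263)},
    "B": {"small": (234, 270), "large": (55, 80)},
}

def pooled_rate(groups):
    """Success rate after adding up successes and patients over all stone sizes."""
    successes = sum(s for s, _ in groups.values())
    patients = sum(n for _, n in groups.values())
    return Fraction(successes, patients)

# Success rate within each stone size, and the rate when stone size is ignored.
by_size = {
    treatment: {size: Fraction(s, n) for size, (s, n) in groups.items()}
    for treatment, groups in data.items()
}
pooled = {treatment: pooled_rate(groups) for treatment, groups in data.items()}

for size in ("small", "large"):
    print(size, round(float(by_size["A"][size]), 3), round(float(by_size["B"][size]), 3))
print("both ", round(float(pooled["A"]), 3), round(float(pooled["B"]), 3))
```

Treatment A wins within each stone size, but its pooled rate is dominated by the large-stone group (263 of its 350 patients), while the pooled rate of Treatment B is dominated by the small-stone group (270 of 350).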
| |
| | |
| ===Berkeley gender bias case===
| |
| One of the best-known real-life examples of Simpson's paradox occurred when the [[University of California, Berkeley]] was sued for bias against women who had applied for admission to [[graduate school]]s there. The admission figures for the fall of 1973 showed that men applying were more likely than women to be admitted, and the difference was so large that it was unlikely to be due to chance.<ref name="freedman">David Freedman, Robert Pisani and Roger Purves. ''Statistics'' (3rd edition). W.W. Norton, 1998. ISBN 0-393-97083-3.</ref><ref name="Bickel">{{cite journal
| |
| | author = P.J. Bickel, E.A. Hammel and J.W. O'Connell
| |
| | year = 1975
| |
| | title = Sex Bias in Graduate Admissions: Data From Berkeley
| |
| | journal = [[Science (journal)|Science]]
| |
| | volume = 187
| |
| | pages = 398–404
| |
| | doi = 10.1126/science.187.4175.398
| |
| | pmid = 17835295
| |
| | issue = 4175
| |
| | url=http://www.sciencemag.org/cgi/content/abstract/187/4175/398
| |
| }}.</ref>
| |
| | |
| {| class="wikitable" style="margin-left:auto; margin-right:auto;"
| |
| |-
| |
| !
| |
| ! Applicants
| |
| ! Admitted
| |
| |-
| |
| ! Men
| |
| | 8442
| |
| | '''44%'''
| |
| |-
| |
| ! Women
| |
| | 4321
| |
| | 35%
| |
| |}
| |
| | |
| But when examining the individual departments, it appeared that no department was significantly biased against women. In fact, most departments had a "small but [[statistical significance|statistically significant]] bias in favor of women."<ref name="Bickel" /> The data from the six largest departments are listed below.
| |
| | |
| {| class="wikitable" style="margin-left:auto; margin-right:auto;"
| |
| |-
| |
| ! rowspan=2 | Department
| |
| ! colspan=2 | Men
| |
| ! colspan=2 | Women
| |
| |-
| |
| ! Applicants
| |
| ! Admitted
| |
| ! Applicants
| |
| ! Admitted
| |
| |-
| |
| ! A
| |
| | 825
| |
| | 62%
| |
| | 108
| |
| | '''82%'''
| |
| |-
| |
| ! B
| |
| | 560
| |
| | 63%
| |
| | 25
| |
| | '''68%'''
| |
| |-
| |
| ! C
| |
| | 325
| |
| | '''37%'''
| |
| | 593
| |
| | 34%
| |
| |-
| |
| ! D
| |
| | 417
| |
| | 33%
| |
| | 375
| |
| | '''35%'''
| |
| |-
| |
| ! E
| |
| | 191
| |
| | '''28%'''
| |
| | 393
| |
| | 24%
| |
| |-
| |
| ! F
| |
| | 373
| |
| | 6%
| |
| | 341
| |
| | '''7%'''
| |
| |}
| |
| | |
| The research paper by Bickel et al.<ref name="Bickel" /> concluded that women tended to apply to competitive departments with low rates of admission even among qualified applicants (such as in the English Department), whereas men tended to apply to less-competitive departments with high rates of admission among the qualified applicants (such as in [[engineering]] and [[chemistry]]). The conditions under which the admissions' frequency data from specific departments constitute a proper defense against charges of discrimination are formulated in the book ''Causality'' by [[Judea Pearl|Pearl]].<ref name="pearl"/>
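The effect of the application pattern can be reproduced from the six-department table alone. The sketch below is illustrative only: it uses the rounded percentages from that table, so the pooled figures are approximate and cover only these six departments, not all 12,763 applicants. The names <code>departments</code> and <code>pooled_rate</code> are assumptions made for this example.

```python
# Department: (male applicants, male admission rate, female applicants, female admission rate),
# taken from the six-department table above (rates are the rounded percentages shown there).
departments = {
    "A": (825, 0.62, 108, 0.82),
    "B": (560, 0.63, 25, 0.68),
    "C": (325, 0.37, 593, 0.34),
    "D": (417, 0.33, 375, 0.35),
    "E": (191, 0.28, 393, 0.24),
    "F": (373, 0.06, 341, 0.07),
}

def pooled_rate(pairs):
    """Applicant-weighted admission rate over a list of (applicants, rate) pairs."""
    total = sum(n for n, _ in pairs)
    admitted = sum(n * rate for n, rate in pairs)
    return admitted / total

men = pooled_rate([(m, m_rate) for m, m_rate, _, _ in departments.values()])
women = pooled_rate([(w, w_rate) for _, _, w, w_rate in departments.values()])

# Departments in which the female admission rate is the higher one.
women_ahead = [d for d, (_, m_rate, _, w_rate) in departments.items() if w_rate > m_rate]

print(f"pooled rate, men:   {men:.3f}")
print(f"pooled rate, women: {women:.3f}")
print("departments where women have the higher rate:", women_ahead)
```

Women have the higher rate in four of the six departments, yet the pooled rate for men is higher, because most male applicants went to departments A and B, where admission rates were high for everyone.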
| |
| | |
| ===Low birth weight paradox===
| |
| {{Main|Low birth-weight paradox}}
| |
| | |
| The low birth weight paradox is an apparently [[paradox]]ical observation relating to the birth [[weight]]s and mortality of children born to [[tobacco smoking]] mothers. As a usual practice, babies weighing less than a certain amount (which varies between different countries) have been classified as having [[low birth weight]]. In a given population, babies with low birth weights have had a significantly higher [[infant mortality]] rate than others. However, it has been observed that babies of low birth weights born to [[Smoking and pregnancy|smoking mother]]s have a ''lower'' mortality rate than the babies of low birth weights of non-smokers.<ref>{{cite journal | author = Wilcox Allen | year = 2006 | title = The Perils of Birth Weight — A Lesson from Directed Acyclic Graphs | url = http://aje.oxfordjournals.org/cgi/content/abstract/164/11/1121 | journal = American Journal of Epidemiology | volume = 164 | issue = 11| pages = 1121–1123 | doi = 10.1093/aje/kwj276 | pmid = 16931545 }}</ref>
| |
| | |
| ===Batting averages===
| |
| A common example of Simpson's Paradox involves the [[batting average]]s of players in [[professional baseball]]. It is possible for one player to hit for a higher batting average than another player during a given year, and to do so again during the next year, but to have a lower batting average when the two years are combined. This phenomenon can occur when there are large differences in the number of at-bats between the years. (The same situation applies to calculating batting averages for the first half of the baseball season, and during the second half, and then combining all of the data for the season's batting average.)
| |
| | |
| A real-life example is provided by Ken Ross<ref>Ken Ross. ''A Mathematician at the Ballpark: Odds and Probabilities for Baseball Fans'' (paperback). Pi Press, 2004. ISBN 0-13-147990-3. pp. 12–13.</ref> and involves the batting averages of two baseball players, [[Derek Jeter]] and [[David Justice]], during the 1995 and 1996 seasons:<ref>Statistics available from http://www.baseball-reference.com/ : [http://www.baseball-reference.com/j/jeterde01.shtml Data for Derek Jeter], [http://www.baseball-reference.com/j/justida01.shtml Data for David Justice].</ref>
| |
| | |
| {| class="wikitable" style="margin-left:auto; margin-right:auto;"
| |
| |-
| |
| !
| |
| ! colspan=2 | 1995
| |
| ! colspan=2 | 1996
| |
| ! colspan=2 | Combined
| |
| |-
| |
| | Derek Jeter
| |
| | 12/48
| |
| | .250
| |
| | 183/582
| |
| | .314
| |
| | 195/630
| |
| | '''.310'''
| |
| |-
| |
| | David Justice
| |
| | 104/411
| |
| | '''.253'''
| |
| | 45/140
| |
| | '''.321'''
| |
| | 149/551
| |
| | .270
| |
| |}
| |
| | |
| In both 1995 and 1996, Justice had a higher batting average (in bold type) than Jeter did. However, when the two seasons are combined, Jeter shows a higher batting average than Justice. According to Ross, this phenomenon would be observed about once per year among the possible pairs of interesting baseball players. In this particular case, Simpson's paradox can still be observed if the 1997 season is also taken into account:
| |
| | |
| {| class="wikitable" style="margin-left:auto; margin-right:auto;"
| |
| |-
| |
| !
| |
| ! colspan=2 | 1995
| |
| ! colspan=2 | 1996
| |
| ! colspan=2 | 1997
| |
| ! colspan=2 | Combined
| |
| |-
| |
| | Derek Jeter
| |
| | 12/48
| |
| | .250
| |
| | 183/582
| |
| | .314
| |
| | 190/654
| |
| | .291
| |
| | 385/1284
| |
| | '''.300'''
| |
| |-
| |
| | David Justice
| |
| | 104/411
| |
| | '''.253'''
| |
| | 45/140
| |
| | '''.321'''
| |
| | 163/495
| |
| | '''.329'''
| |
| | 312/1046
| |
| | .298
| |
| |}
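The figures in both batting tables can be verified with a few lines of exact arithmetic. This is a minimal sketch using only the hits and at-bats listed above; the helper names <code>average</code> and <code>combined</code> are chosen here for illustration.

```python
from fractions import Fraction

# (hits, at-bats) per season, copied from the tables above.
seasons = {
    "Jeter": {1995: (12, 48), 1996: (183, 582), 1997: (190, 654)},
    "Justice": {1995: (104, 411), 1996: (45, 140), 1997: (163, 495)},
}

def average(player, year):
    """Batting average for a single season."""
    hits, at_bats = seasons[player][year]
    return Fraction(hits, at_bats)

def combined(player, years):
    """Batting average over several seasons: total hits over total at-bats."""
    hits = sum(seasons[player][y][0] for y in years)
    at_bats = sum(seasons[player][y][1] for y in years)
    return Fraction(hits, at_bats)

years = (1995, 1996, 1997)
for y in years:
    print(y, round(float(average("Jeter", y)), 3), round(float(average("Justice", y)), 3))
print("combined", round(float(combined("Jeter", years)), 3),
      round(float(combined("Justice", years)), 3))
```

Justice leads in every individual season, but Jeter's combined average is higher because nearly all of his at-bats come from his stronger seasons, whereas Justice's largest share comes from 1995, his weakest.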
| |
| | |
| The Jeter and Justice example of Simpson's paradox was referred to in the "Conspiracy Theory" episode of the television series ''[[Numb3rs]]'', though a chart shown omitted some of the data, and listed the 1996 averages as 1995.{{citation needed|date=August 2012}}
| |
| | |
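| The reversal goes away if each season is given the same weight for both players before combining. In the table below, each season's average is rescaled to the larger of the two players' at-bat totals for that season (411 for 1995 and 582 for 1996), so that the combined figures compare like with like.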
| |
| {| class="wikitable" style="margin-left:auto; margin-right:auto;"
| |
| |-
| |
| !
| |
| ! colspan=3 | 1995
| |
| ! colspan=3 | 1996
| |
| ! colspan=2 | Combined
| |
| |-
| |
| | Derek Jeter
| |
| | 12/48*411
| |
| | 102.75/411
| |
| | .250
| |
| | 183/582*582
| |
| | 183/582
| |
| | .314
| |
| | 285.75/993
| |
| | .288
| |
| |-
| |
| | David Justice
| |
| | 104/411*411
| |
| | 104/411
| |
| | '''.253'''
| |
| | 45/140*582
| |
| | 187/582
| |
| | '''.321'''
| |
| | 291/993
| |
| | '''.293'''
| |
| |}
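The rescaling used in the table above amounts to averaging each player's seasonal rates with a common set of weights. The sketch below reproduces it under that reading; the choice of the larger at-bat total as the common weight follows the table, and the names <code>weights</code> and <code>standardized</code> are assumptions made for this example.

```python
from fractions import Fraction

# (hits, at-bats) per season, copied from the tables above.
batting = {
    "Jeter": {1995: (12, 48), 1996: (183, 582)},
    "Justice": {1995: (104, 411), 1996: (45, 140)},
}

# Common weight for each season: the larger of the two players' at-bat totals.
weights = {
    year: max(batting[player][year][1] for player in batting)
    for year in (1995, 1996)
}

def standardized(player):
    """Average of the seasonal rates, weighted identically for every player."""
    total_weight = sum(weights.values())
    weighted_hits = sum(Fraction(*batting[player][year]) * weights[year] for year in weights)
    return weighted_hits / total_weight

for player in batting:
    print(player, round(float(standardized(player)), 3))
```

With identical weights for both players, a player who is ahead in every season is necessarily ahead in the weighted average, so no reversal can occur.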
| |
| | |
| ==Description==
| |
| [[Image:Simpson's paradox.svg|thumb|250px|right|Illustration of Simpson's paradox. The first graph (on the top) represents Lisa's contribution, the second one Bart's. The blue bars represent the first week, the red bars the second week; the triangles indicate the combined percentage of good contributions (weighted average). While Bart's bars both show a higher rate of success than Lisa's, Lisa's combined rate is higher because most of her edits fall in the second week, when both editors' success rates were much higher.]]
| |
| | |
| Suppose two people, Lisa and Bart, each edit articles for two weeks. In the first week, Lisa improves 0 of the 3 articles she edited, and Bart improves 1 of the 7 articles he edited. In the second week, Lisa improves 5 of the 7 articles she edited, while Bart improves all 3 of the articles he edited.
| |
| | |
| {| class="wikitable" summary="Lisa and Bart's modifications" style="margin-left:auto; margin-right:auto;"
| |
| !
| |
| ! Week 1
| |
| ! Week 2
| |
| ! Total
| |
| |- align=center
| |
| ! Lisa
| |
| | 0/3 || 5/7 || '''5/10'''
| |
| |- align=center
| |
| ! Bart
| |
| | '''1/7''' || '''3/3''' || 4/10
| |
| |}
| |
| | |
| In both weeks Bart improved a higher percentage of articles than Lisa, but the number of articles each edited (the denominator of each ratio, also known as the ''[[sample size]]'') differed between them in each week. Over the two weeks together, each edited the same number of articles (10), and Lisa improved more of them, so her overall ratio and percentage are higher. Each overall rate is a weighted average of the two weekly rates, with weights proportional to the number of articles edited that week: most of Lisa's edits fall in the second week, when both editors' rates were high, whereas most of Bart's fall in the first week, when both were low. Like other paradoxes, it therefore only appears to be a paradox because of incorrect assumptions, incomplete or misleading information, or a lack of understanding of a particular concept.
| |
| | |
| {| class="wikitable" summary="Lisa and Bart's modifications" style="margin-left:auto; margin-right:auto;"
| |
| !
| |
| ! Week 1 success rate
| |
| ! Week 2 success rate
| |
| ! Overall success rate (weighted by articles edited)
| |
| |- align=center
| |
| ! Lisa
| |
| | 0% || 71.4% || '''50%'''
| |
| |- align=center
| |
| ! Bart
| |
| | '''14.3%''' || '''100%''' || 40%
| |
| |}
| |
| The apparent paradox arises when only the percentages are reported and not the underlying ratios. In this example, if only Bart's first-week percentage (1/7, about 14.3%) were given, and not the ratio itself (1 of 7), the information needed to weight the two weeks correctly would be lost. Even though Bart's percentage is higher in both the first and the second week, when the two weeks of articles are combined Lisa has improved the greater proportion: 50% of her 10 articles, against 40% for Bart.
| |
| | |
| In symbols, writing <math>S_L(i)</math> and <math>S_B(i)</math> for Lisa's and Bart's success rates in week ''i'':
| |
| | |
| * In the first week
| |
| :* <math>S_L(1) = 0\%</math>: Lisa improved 0% of the articles she edited.
| |
| :* <math>S_B(1) = 14.3\%</math>: Bart improved 14.3% of the articles he edited.
| |
| : Success is associated with Bart.
| |
| | |
| * In the second week
| |
| :* <math>S_L(2) = 71.4\%</math>: Lisa improved 71.4% of the articles she edited.
| |
| :* <math>S_B(2) = 100\%</math>: Bart improved all of the articles he edited.
| |
| : Success is associated with Bart.
| |
| | |
| On both occasions Bart's edits were more successful than Lisa's. But if we combine the two sets, we see that Lisa and Bart both edited 10 articles, and:
| |
| | |
| * <math>S_L = \begin{matrix}\frac{5}{10}\end{matrix}</math>: Lisa improved 5 articles.
| |
| * <math>S_B = \begin{matrix}\frac{4}{10}\end{matrix}</math>: Bart improved only 4.
| |
| * <math>S_L > S_B \,</math>: Success is now associated with Lisa.
| |
| | |
| Bart is better for each set but worse overall.
| |
| | |
| The paradox stems from the intuition that Bart could not possibly be a better editor on each set but worse overall. Pearl proved how this is possible, when "better editor" is taken in the counterfactual sense: "Were Bart to edit all items in a set he would do better than Lisa would, on those same items".<ref name="pearl"/> Clearly, frequency data cannot support this sense of "better editor," because it does not tell us how Bart would perform on items edited by Lisa, and vice versa. In the back of our mind, though, we assume that the articles were assigned at random to Bart and Lisa, an assumption which (for a large sample) would support the counterfactual interpretation of "better editor." However, under random assignment conditions, the data given in this example are unlikely, which accounts for our surprise when confronting the rate reversal.
| |
| | |
| The arithmetical basis of the paradox is uncontroversial. If <math>S_B(1) > S_L(1)</math> and <math>S_B(2) > S_L(2)</math> we feel that <math>S_B</math> ''must be greater'' than <math>S_L</math>. However, if ''different'' weights are used to form the overall score for each person, this intuition can fail. Here the first week is weighted <math>\begin{matrix}\frac{3}{10}\end{matrix}</math> for Lisa and <math>\begin{matrix}\frac{7}{10}\end{matrix}</math> for Bart, while the weights are reversed for the second week.
| |
| | |
| * <math>S_L = \begin{matrix}\frac{3}{10}\end{matrix}S_L(1) + \begin{matrix}\frac{7}{10}\end{matrix}S_L(2)</math>
| |
| | |
| * <math>S_B = \begin{matrix}\frac{7}{10}\end{matrix}S_B(1) + \begin{matrix}\frac{3}{10}\end{matrix}S_B(2)</math>
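The two weighted sums above can be evaluated exactly. The following is a small sketch with names (<code>edits</code>, <code>weights</code>, <code>overall</code>) chosen here for illustration; it uses only the counts from the Lisa and Bart table.

```python
from fractions import Fraction

# (articles improved, articles edited) in week 1 and week 2, from the table above.
edits = {
    "Lisa": [(0, 3), (5, 7)],
    "Bart": [(1, 7), (3, 3)],
}

def weekly_rates(person):
    """Success rate S(1), S(2) for each week."""
    return [Fraction(improved, edited) for improved, edited in edits[person]]

def weights(person):
    """Share of the person's edited articles that falls in each week."""
    total = sum(edited for _, edited in edits[person])
    return [Fraction(edited, total) for _, edited in edits[person]]

def overall(person):
    """Overall rate as the weighted sum of the weekly rates."""
    return sum(w * r for w, r in zip(weights(person), weekly_rates(person)))

for person in edits:
    print(person, weights(person), weekly_rates(person), overall(person))
```

Bart's weekly rates are both higher, but his larger weight (7/10) sits on the week with the low rate, which is what allows Lisa's overall rate to come out ahead.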
| |
| | |
| Lisa is a better editor on average, as her overall success rate is higher. But it is possible to have told the story in a way which would make it appear obvious that Bart is more diligent.
| |
| | |
| Simpson's paradox shows us an extreme example of the importance of including data about possible confounding variables when attempting to calculate causal relations. Precise criteria for selecting a set of "confounding variables" (i.e., variables that yield correct causal relationships if included in the analysis) are given in Pearl<ref name="pearl"/> using causal graphs.
| |
| | |
| While Simpson's paradox often refers to the analysis of count tables, as shown in this example, it also occurs with continuous data:<ref>John Fox (1997). ''Applied Regression Analysis, Linear Models, and Related Methods''. Sage Publications. ISBN 0-8039-4540-X. pp. 136–137.</ref> for example, if one fits separate [[linear regression|regression line]]s through two sets of data, the two regression lines may show a positive trend, while a regression line fitted through all data together will show a ''negative'' trend, as shown in the figure at the top of the article.
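The continuous case can be illustrated with a small synthetic data set. The points below are invented for this sketch and do not come from any cited source; <code>ols_slope</code> is an ordinary least-squares slope written out by hand so that the example has no dependencies.

```python
def ols_slope(points):
    """Least-squares slope of y on x for a list of (x, y) points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    return sxy / sxx

# Two invented groups, each lying on a line of slope +1,
# with the second group shifted to the right and downwards.
group_1 = [(x, x + 10) for x in range(1, 6)]   # x = 1..5,  y = 11..15
group_2 = [(x, x - 5) for x in range(8, 13)]   # x = 8..12, y = 3..7

slope_1 = ols_slope(group_1)
slope_2 = ols_slope(group_2)
slope_all = ols_slope(group_1 + group_2)

print("group 1 slope:", slope_1)
print("group 2 slope:", slope_2)
print("pooled slope: ", round(slope_all, 3))
```

Within each group the trend is positive, but the pooled fit is dominated by the difference between the group centres, which runs downwards, so the pooled slope is negative.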
| |
| | |
| ===Vector interpretation===
| |
| [[Image:Simpsons-vector.svg|thumb|Vector interpretation of Simpson's paradox]]
| |
| Simpson's paradox can also be illustrated using the 2-dimensional [[vector space]].<ref>{{cite journal | author = Kocik Jerzy | year = 2001 | title = Proofs without Words: Simpson's Paradox | url = http://www.math.siu.edu/kocik/papers/simpson2.pdf | format = PDF | journal = Mathematics Magazine | volume = 74 | issue = 5| page = 399 }}</ref> A success rate of <math>p/q</math> can be represented by a [[vector (geometry)|vector]] <math>\overrightarrow{A}=(q,p)</math>, with a [[slope]] of <math>p/q</math>. If two rates <math>p_1/q_1</math> and <math>p_2/q_2</math> are combined, as in the examples given above, the result can be represented by the sum of the vectors <math>(q_1, p_1)</math> and <math>(q_2, p_2)</math>, which according to the [[parallelogram rule]] is the vector <math>(q_1+q_2, p_1+p_2)</math>, with slope <math>\frac{p_1+p_2}{q_1+q_2}</math>.
| |
| | |
| Simpson's paradox says that even if a vector <math>\overrightarrow{b_1}</math> (in blue in the figure) has a smaller slope than another vector <math>\overrightarrow{r_1}</math> (in red), and <math>\overrightarrow{b_2}</math> has a smaller slope than <math>\overrightarrow{r_2}</math>, the sum of the two vectors <math>\overrightarrow{b_1} + \overrightarrow{b_2}</math> (indicated by "+" in the figure) can still have a larger slope than the sum of the two vectors <math>\overrightarrow{r_1} + \overrightarrow{r_2}</math>, as shown in the example.
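The vector picture can be tried out numerically. In the sketch below the Lisa and Bart counts from the Description section are reused as vectors <math>(q, p)</math>; assigning Lisa's results to the "blue" role and Bart's to the "red" role is an illustrative choice made here and is not taken from the figure.

```python
from fractions import Fraction

def slope(vector):
    """Slope p/q of a vector (q, p), i.e. the success rate it represents."""
    q, p = vector
    return Fraction(p, q)

def add(u, v):
    """Component-wise vector sum, which combines two groups of trials."""
    return (u[0] + v[0], u[1] + v[1])

# (q, p) = (articles edited, articles improved) for week 1 and week 2.
blue = [(3, 0), (7, 5)]   # Lisa
red = [(7, 1), (3, 3)]    # Bart

blue_sum = add(*blue)
red_sum = add(*red)

print("week slopes:", [slope(v) for v in blue], [slope(v) for v in red])
print("sum slopes: ", slope(blue_sum), slope(red_sum))
```

Each blue vector is flatter than the corresponding red one, yet the blue sum is steeper: the long blue vector is the steep one, while the long red vector is the flat one.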
| |
| | |
| ==Implications for decision making==
| |
| The practical significance of Simpson's paradox surfaces in decision making situations where it poses the following dilemma: Which data should we consult in choosing an action, the aggregated or the partitioned? In the Kidney Stone example above, it is clear that if one is diagnosed with "Small Stones" or "Large Stones" the data for the respective subpopulation should be consulted and Treatment A would be preferred to Treatment B. But what if a patient is not diagnosed, and the size of the stone is not known; would it be appropriate to consult the aggregated data and administer Treatment B? This would stand contrary to common sense; a treatment that is preferred both under one condition and under its negation should also be preferred when the condition is unknown.
| |
| | |
| On the other hand, if the partitioned data is to be preferred a priori, what prevents one from partitioning the data into arbitrary sub-categories (say based on eye color or post-treatment pain) artificially constructed to yield wrong choices of treatments? Pearl<ref name="pearl"/> shows that, indeed, in many cases it is the aggregated, not the partitioned data that gives the correct choice of action. Worse yet, given the same table, one should sometimes follow the partitioned and sometimes the aggregated data, depending on the story behind the data; with each story dictating its own choice. Pearl<ref name="pearl"/> considers this to be the real paradox behind Simpson's reversal.
| |
| | |
| As to why and how a story, not data, should dictate choices, the answer is that it is the story which encodes the causal relationships among the variables. Once we extract these relationships and represent them in a graph called a [[Bayesian Networks|causal Bayesian network]], we can test algorithmically whether a given partition, representing confounding variables, gives the correct answer. The test, called the "back-door" criterion, requires that we check whether the nodes corresponding to the confounding variables intercept certain paths in the graph. This reduces Simpson's paradox to an exercise in graph theory.
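When a set of variables does pass the back-door test, the causal effect can be estimated by averaging the stratum-specific rates over the overall distribution of the strata (the adjustment formula). The sketch below applies this to the kidney stone counts under a loudly stated assumption: that stone size is the only confounder and satisfies the criterion. That assumption comes from the story behind the data, not from the data themselves, and the names <code>counts</code>, <code>p_size</code> and <code>adjusted</code> are chosen here for illustration.

```python
from fractions import Fraction

# (successes, patients) per (treatment, stone size), from the kidney stone table.
counts = {
    ("A", "small"): (81, 87),
    ("A", "large"): (192, 263),
    ("B", "small"): (234, 270),
    ("B", "large"): (55, 80),
}

n_total = sum(n for _, n in counts.values())

# Distribution of stone size in the whole study population, ignoring treatment.
p_size = {
    size: Fraction(sum(n for (_, s), (_, n) in counts.items() if s == size), n_total)
    for size in ("small", "large")
}

def adjusted(treatment):
    """Adjustment formula: sum over sizes of P(success | treatment, size) * P(size).

    Valid as a causal estimate only under the ASSUMPTION that stone size
    satisfies the back-door criterion for this treatment and outcome.
    """
    return sum(Fraction(*counts[(treatment, size)]) * p_size[size] for size in p_size)

for treatment in ("A", "B"):
    print(treatment, round(float(adjusted(treatment)), 4))
```

Under that assumption the adjusted rates favour Treatment A, in agreement with the partitioned data; with a different causal story the same table could call for the aggregated figures instead.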
| |
| | |
| ==Psychology==
| |
| Psychological interest in Simpson's paradox seeks to explain why people deem sign reversal to be impossible at first. The question is where people get this strong intuition from, and how it is encoded in the mind. Simpson's paradox demonstrates that this intuition cannot be supported by probability calculus alone, and thus led philosophers to speculate that it is supported by an innate causal logic that guides people in reasoning about actions and their consequences. Savage's [[sure-thing principle]]<ref name="blyth-72"/> is an example of what such logic may entail. A qualified version of Savage's sure thing principle can indeed be derived from Pearl's ''do''-calculus<ref name="pearl"/> and reads: "An action ''A'' that increases the probability of an event ''B'' in each subpopulation ''C<sub>i</sub>'' of ''C'' must also increase the probability of ''B'' in the population as a whole, provided that the action does not change the distribution of the subpopulations." This suggests that knowledge about actions and consequences is stored in a form resembling Causal [[Bayesian Networks]].
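The qualified principle quoted above can be sketched in one line. The notation is an assumption made here for illustration: <math>\operatorname{do}(A)</math> and <math>\operatorname{do}(\neg A)</math> denote taking and withholding the action, and the proviso is read as <math>P(C_i \mid \operatorname{do}(A)) = P(C_i \mid \operatorname{do}(\neg A)) = P(C_i)</math>. If <math>P(B \mid \operatorname{do}(A), C_i) > P(B \mid \operatorname{do}(\neg A), C_i)</math> for every ''i'', then

: <math>P(B \mid \operatorname{do}(A)) = \sum_i P(B \mid \operatorname{do}(A), C_i)\,P(C_i) > \sum_i P(B \mid \operatorname{do}(\neg A), C_i)\,P(C_i) = P(B \mid \operatorname{do}(\neg A)).</math>

Both sums use the same weights <math>P(C_i)</math>, so no reversal is possible; a reversal in observed frequencies requires the weights to differ between the two sides, as in the examples above.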
| |
| | |
| ==Probability==
| |
| In a randomly selected '''2 × 2 × 2''' table, Simpson's paradox occurs with [[probability]] approximately <sup>1</sup>/<sub>60</sub>.<ref>{{cite journal
| |
| | title = How Likely is Simpson's Paradox?
| |
| | author = Marios G. Pavlides and Michael D. Perlman
| |
| |date=August 2009
| |
| | journal = [[The American Statistician]]
| |
| | volume = 63
| |
| | issue = 3
| |
| | pages = 226–233
| |
| | doi = 10.1198/tast.2009.09007
| |
| }}</ref>
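A figure of this kind can be explored by simulation. The sketch below is a rough Monte Carlo estimate under an assumption about what "randomly selected" means: the eight cell probabilities are drawn uniformly from the probability simplex, which is done here by sampling independent exponential variables. The function names and the number of simulated tables are choices made for this example, and the estimate depends on that sampling assumption.

```python
import random

def association(a, b, c, d):
    """Cross-product difference ad - bc of the 2x2 table [[a, b], [c, d]]."""
    return a * d - b * c

def is_reversal(t):
    """True if both strata show one sign of association and the pooled table the other.

    t[i][j][k]: i = treatment, j = outcome, k = stratum (each index 0 or 1).
    """
    within = [association(t[0][0][k], t[0][1][k], t[1][0][k], t[1][1][k]) for k in (0, 1)]
    pooled = association(
        t[0][0][0] + t[0][0][1], t[0][1][0] + t[0][1][1],
        t[1][0][0] + t[1][0][1], t[1][1][0] + t[1][1][1],
    )
    return (min(within) > 0 and pooled < 0) or (max(within) < 0 and pooled > 0)

def estimate(n_tables, seed=0):
    """Fraction of random 2x2x2 tables that exhibit a reversal."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_tables):
        # Independent exponentials, once normalised, are uniform on the simplex.
        # The reversal test is scale-invariant, so the normalisation is skipped.
        w = [rng.expovariate(1.0) for _ in range(8)]
        table = [[[w[4 * i + 2 * j + k] for k in (0, 1)] for j in (0, 1)] for i in (0, 1)]
        hits += is_reversal(table)
    return hits / n_tables

# Sanity check on real counts: kidney stone table as [treatment][outcome][stone size].
kidney = [[[81, 192], [6, 71]], [[234, 55], [36, 25]]]
print("kidney stone table reverses:", is_reversal(kidney))

estimated_probability = estimate(100_000)
print("estimated probability:", estimated_probability)
```

The printed estimate can be compared with the cited value; a different notion of "random table" would give a different number.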
| |
| | |
| ==Related concepts==
| |
| * [[Ecological fallacy]] (and [[ecological correlation]])
| |
| * [[Modifiable areal unit problem]]
| |
| * [[Prosecutor's fallacy]]
| |
| | |
| ==References==
| |
| {{Reflist|2}}
| |
| | |
| == Bibliography ==
| |
| * Leila Schneps and Coralie Colmez, ''Math on trial. How numbers get used and abused in the courtroom'', Basic Books, 2013. ISBN 978-0-465-03292-1. (Sixth chapter: "Math error number 6: Simpson's paradox. The Berkeley sex bias case: discrimination detection").
| |
| | |
| ==External links==
| |
| {{Commons category|Simpson's paradox}}
| |
| | |
| *[[Stanford Encyclopedia of Philosophy]]: "[http://plato.stanford.edu/entries/paradox-simpson/ Simpson's Paradox]" – by Gary Malinas.
| |
| *[http://jeff560.tripod.com/s.html Earliest known uses of some of the words of mathematics: S]
| |
| **For a brief history of the origins of the paradox see the entries "Simpson's Paradox" and "Spurious Correlation"
| |
| * [[Judea Pearl|Pearl, Judea]], [http://bayes.cs.ucla.edu/LECTURE/lecture_sec1.htm "The Art and Science of Cause and Effect"]. A slide show and tutorial lecture.
| |
| * [[Judea Pearl|Pearl, Judea]], [http://bayes.cs.ucla.edu/R264.pdf "Simpson's Paradox: An Anatomy"] ([[PDF]])
| |
| *[http://vudlab.com/simpsons/ Simpson's Paradox Visualized] - an interactive demonstration of Simpson's paradox.
| |
| * Short articles by Alexander Bogomolny at [[cut-the-knot]]:
| |
| ** "[http://www.cut-the-knot.org/blue/Mediant.shtml Mediant Fractions.]"
| |
| ** "[http://www.cut-the-knot.org/Curriculum/Algebra/SimpsonParadox.shtml Simpson's Paradox.]"
| |
| *[http://online.wsj.com/article/SB125970744553071829.html The Wall Street Journal column "The Numbers Guy"] for December 2, 2009, written by Cari Tuna (substituting for regular columnist Carl Bialik), dealt with recent instances of Simpson's paradox in the news, notably in the comparison of unemployment rates in the 2009 recession with those in the 1983 recession.
| |
| | |
| {{Portal bar|Mathematics|Statistics}}
| |
| | |
| {{DEFAULTSORT:Simpson's Paradox}}
| |
| [[Category:Probability theory paradoxes]]
| |
| [[Category:Statistical paradoxes]]
| |
| [[Category:Causal inference]]
| |