Causal inference in statistics: An overview

This review presents empirical researchers with recent advances in causal inference, and stresses the paradigmatic shifts that must be undertaken in moving from traditional statistical analysis to causal analysis of multivariate data. Special emphasis is placed on the assumptions that underlie all causal inferences, the languages used in formulating those assumptions, the conditional nature of all causal and counterfactual claims, and the methods that have been developed for the assessment of such claims. These advances are illustrated using a general theory of causation based on the Structural Causal Model (SCM) described in Pearl (2000a), which subsumes and unifies other approaches to causation, and provides a coherent mathematical foundation for the analysis of causes and counterfactuals. In particular, the paper surveys the development of mathematical tools for inferring (from a combination of data and assumptions) answers to three types of causal queries: (1) queries about the effects of potential interventions (also called "causal effects" or "policy evaluation"), (2) queries about probabilities of counterfactuals (including assessment of "regret," "attribution" or "causes of effects"), and (3) queries about direct and indirect effects (also known as "mediation"). Finally, the paper defines the formal and conceptual relationships between the structural and potential-outcome frameworks and presents tools for a symbiotic analysis that uses the strong features of both.
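
As a minimal sketch of the first query type (effects of potential interventions), the following Python example contrasts conditioning on X with intervening on X in a small structural model. The model, its three binary variables (Z -> X, Z -> Y, X -> Y), and every probability in it are hypothetical and chosen only for illustration; none of them come from the paper. The interventional quantity is computed by adjusting for Z, which is valid here only under the assumption that Z is the sole confounder and is observed.

```python
from itertools import product

# Hypothetical binary model (all numbers are illustrative assumptions):
#   Z -> X, Z -> Y, X -> Y, where Z is an observed confounder.
P_Z = {0: 0.5, 1: 0.5}                       # P(Z = z)
P_X1_GIVEN_Z = {0: 0.2, 1: 0.8}              # P(X = 1 | Z = z)
P_Y1_GIVEN_XZ = {(0, 0): 0.1, (1, 0): 0.3,   # P(Y = 1 | X = x, Z = z)
                 (0, 1): 0.5, (1, 1): 0.7}


def joint(x, y, z):
    """Joint probability P(X=x, Y=y, Z=z) implied by the model."""
    px = P_X1_GIVEN_Z[z] if x == 1 else 1 - P_X1_GIVEN_Z[z]
    py = P_Y1_GIVEN_XZ[(x, z)] if y == 1 else 1 - P_Y1_GIVEN_XZ[(x, z)]
    return P_Z[z] * px * py


def p_y1_given_x(x):
    """Observational quantity P(Y=1 | X=x): ordinary conditioning."""
    numerator = sum(joint(x, 1, z) for z in (0, 1))
    denominator = sum(joint(x, y, z) for y, z in product((0, 1), repeat=2))
    return numerator / denominator


def p_y1_do_x(x):
    """Interventional quantity P(Y=1 | do(X=x)).

    Setting X by intervention removes the arrow Z -> X, so Z keeps its
    marginal distribution: sum_z P(Y=1 | x, z) P(z).
    """
    return sum(P_Y1_GIVEN_XZ[(x, z)] * P_Z[z] for z in (0, 1))


associational_difference = p_y1_given_x(1) - p_y1_given_x(0)
causal_effect = p_y1_do_x(1) - p_y1_do_x(0)
print(f"P(Y=1|X=1) - P(Y=1|X=0)         = {associational_difference:.2f}")
print(f"P(Y=1|do(X=1)) - P(Y=1|do(X=0)) = {causal_effect:.2f}")
```

In this toy model the associational difference (0.44) overstates the causal effect (0.20) because Z raises both X and Y. The gap between the two numbers cannot be read off the data alone; it follows from the assumed causal structure.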

[1]  K. Pearson,et al.  Mathematical Contributions to the Theory of Evolution. VI. Genetic (Reproductive) Selection: Inheritance of Fertility in Man, and of Fecundity in Thoroughbred Racehorses , 1899 .

[2]  G. Yule NOTES ON THE THEORY OF ASSOCIATION OF ATTRIBUTES IN STATISTICS , 1903 .

[3]  T. Haavelmo The Statistical Implications of a System of Simultaneous Equations , 1943 .

[4]  J BERKSON,et al.  Limitations of the application of fourfold table analysis to hospital data. , 1946, Biometrics.

[5]  E. H. Simpson,et al.  The Interpretation of Interaction in Contingency Tables , 1951 .

[6]  I NICOLETTI,et al.  The Planning of Experiments , 1936, Rivista di clinica pediatrica.

[7]  Guitton Henri Cowles commission for research in economics - Report for Period July 1, 1952 - June 30, 1954. , 1956 .

[8]  Robert H. Strotz,et al.  Recursive versus non-recursive systems: An attempt at a synthesis , 2017 .

[9]  H. Simon,et al.  Cause and Counterfactual , 1966 .

[10]  Michail Prodan,et al.  CHAPTER 17 – THE PLANNING OF EXPERIMENTS , 1968 .

[11]  P. Suppes A Probabilistic Theory Of Causality , 1970 .

[12]  C. Blyth On Simpson's Paradox and the Sure-Thing Principle , 1972 .

[13]  A. Goldberger,et al.  Structural Equation Models in the Social Sciences. , 1974 .

[14]  D. Rubin Estimating causal effects of treatments in randomized and nonrandomized studies. , 1974 .

[16]  O S Miettinen,et al.  Proportion of disease caused or prevented by a given exposure, trait or intervention. , 1974, American journal of epidemiology.

[17]  Stephen E. Fienberg,et al.  Discrete Multivariate Analysis: Theory and Practice , 1976 .

[18]  R. Plackett Discrete Multivariate Analysis: Theory and Practice , 1976 .

[19]  H. Simon,et al.  Causal Ordering and Identifiability , 1977 .

[20]  H. Theil Introduction to econometrics , 1978 .

[21]  A. Dawid Conditional Independence in Statistical Theory , 1979 .

[22]  M. R. Novick,et al.  The Role of Exchangeability in Inference , 1981 .

[23]  D. A. Kenny,et al.  Correlation and Causation , 1937, Wilmott.

[24]  D. A. Kenny,et al.  Correlation and Causation. , 1982 .

[25]  D. Rubin,et al.  The central role of the propensity score in observational studies for causal effects , 1983 .

[26]  T. Speed,et al.  Recursive causal models , 1984, Journal of the Australian Mathematical Society. Series A. Pure Mathematics and Statistics.

[27]  J. Robins A new approach to causal inference in mortality studies with a sustained exposure period—application to control of the healthy worker survivor effect , 1986 .

[28]  J M Robins,et al.  Identifiability, exchangeability, and epidemiological confounding. , 1986, International journal of epidemiology.

[29]  J. Robins A graphical approach to the identification and estimation of causal parameters in mortality studies with sustained exposure periods. , 1987, Journal of chronic diseases.

[30]  I. Good,et al.  The Amalgamation and Geometry of Two-by-Two Contingency Tables , 1987 .

[31]  P. Holland CAUSAL INFERENCE, PATH ANALYSIS AND RECURSIVE STRUCTURAL EQUATIONS MODELS , 1988 .

[32]  D. Francis An introduction to structural equation models. , 1988, Journal of clinical and experimental neuropsychology.

[33]  Judea Pearl,et al.  Probabilistic reasoning in intelligent systems , 1988 .

[34]  S Greenland,et al.  The probability of causation under a stochastic model for individual risk. , 1989, Biometrics.

[35]  J. Robins,et al.  Estimability and estimation of excess and etiologic fractions. , 1989, Statistics in medicine.

[36]  C. Manski Nonparametric Bounds on Treatment Effects , 1989 .

[37]  T. Speed,et al.  On the Application of Probability Theory to Agricultural Experiments. Essay on Principles. Section 9 , 1990 .

[38]  J. N. R. Jeffers,et al.  Graphical Models in Applied Multivariate Statistics. , 1990 .

[39]  D. Rubin [On the Application of Probability Theory to Agricultural Experiments. Essay on Principles. Section 9.] Comment: Neyman (1923) and Causal Inference in Experiments and Observational Studies , 1990 .

[40]  Judea Pearl,et al.  A Theory of Inferred Causation , 1991, KR.

[41]  James J. Heckman,et al.  Randomization and Social Policy Evaluation , 1991 .

[42]  S Greenland,et al.  Estimability and estimation of expected years of life lost due to a hazardous exposure. , 1991, Statistics in medicine.

[43]  J. Robins,et al.  Identifiability and Exchangeability for Direct and Indirect Effects , 1992, Epidemiology.

[44]  N. Wermuth,et al.  Linear Dependencies Represented by Chain Graphs , 1993 .

[45]  Joshua D. Angrist,et al.  Identification of Causal Effects Using Instrumental Variables , 1993 .

[46]  P. Spirtes,et al.  Causation, prediction, and search , 1993 .

[47]  C. Glymour,et al.  Conditioning and Intervening , 1994, The British Journal for the Philosophy of Science.

[48]  Judea Pearl,et al.  On the Testability of Causal Models With Latent and Instrumental Variables , 1995, UAI.

[49]  J. Pearl Causal diagrams for empirical research , 1995 .

[50]  James M. Robins,et al.  Probabilistic evaluation of sequential plans from causal models with hidden variables , 1995, UAI.

[51]  Judea Pearl,et al.  Counterfactuals and Policy Analysis in Structural Models , 1995, UAI.

[52]  James J. Heckman,et al.  Identification of Causal Effects Using Instrumental Variables: Comment , 1996 .

[53]  David Maxwell Chickering,et al.  A Clinician's Tool for Analyzing Non-Compliance , 1996, AAAI/IAAI, Vol. 2.

[54]  David W. Robertson,et al.  The Common Sense of Cause in Fact , 1997 .

[55]  J. Pearl,et al.  Bounds on Treatment Effects from Studies with Imperfect Compliance , 1997 .

[56]  J. Pearl Graphs, Causality, and Structural Equation Models , 1998 .

[57]  M. Sobel Causal Inference in Statistical Models of the Process of Socioeconomic Achievement , 1998 .

[58]  T. Shakespeare,et al.  Observational Studies , 2003 .

[59]  Michael I. Jordan Graphical Models , 1998 .

[60]  Manabu Kuroki,et al.  IDENTIFIABILITY CRITERIA FOR CAUSAL EFFECTS OF JOINT INTERVENTIONS , 1999 .

[61]  J. Pearl,et al.  Causal diagrams for epidemiologic research. , 1999, Epidemiology.

[62]  S Greenland,et al.  Relation of probability of causation to relative risk and doubling dose: a methodologic error that has become a social problem. , 1999, American journal of public health.

[63]  A. P. Dawid,et al.  Causal Inference Without Counterfactuals: Rejoinder , 2000 .

[64]  J. Pearl Causality: Models, Reasoning and Inference , 2000 .

[65]  D. Allen Making things happen. , 2000, Nursing standard (Royal College of Nursing (Great Britain) : 1987).

[66]  D. Rubin Comment on "Causal inference without counterfactuals," by Dawid AP , 2000 .

[67]  J. Robins Data, Design, and Background Knowledge in Etiologic Inference , 2001, Epidemiology.

[68]  Jeffrey M. Wooldridge,et al.  Solutions Manual and Supplementary Materials for Econometric Analysis of Cross Section and Panel Data , 2003 .

[69]  Steffen L. Lauritzen,et al.  Causal Inference from Graphical Models , 2001 .

[70]  Blai Bonet,et al.  Instrumentality Tests Revisited , 2001, UAI.

[71]  Michael Wooldridge,et al.  Econometric Analysis of Cross Section and Panel Data, 2nd Edition , 2001 .

[72]  Judea Pearl,et al.  Direct and Indirect Effects , 2001, UAI.

[73]  Jin Tian,et al.  A general identification condition for causal effects , 2002, AAAI/IAAI.

[74]  D. Lindley Seeing and Doing: the Concept of Causation , 2002 .

[75]  S. Cole,et al.  Fallibility in estimating direct effects. , 2002, International journal of epidemiology.

[76]  D. Rubin,et al.  Principal Stratification in Causal Inference , 2002, Biometrics.

[77]  A. Dawid Influence Diagrams for Causal Modelling and Inference , 2002 .

[78]  Jeffrey M. Woodbridge Econometric Analysis of Cross Section and Panel Data , 2002 .

[79]  P. Shrout,et al.  Mediation in experimental and nonexperimental studies: new procedures and recommendations. , 2002, Psychological methods.

[80]  Tom Burr,et al.  Causation, Prediction, and Search , 2003, Technometrics.

[81]  J. Pearl Statistics and causal inference: A review , 2003 .

[82]  Nanny Wermuth,et al.  A general condition for avoiding effect reversal after marginalization , 2003 .

[83]  E. Arjas,et al.  Causal Reasoning from Longitudinal Data , 2004 .

[84]  Jin Tian,et al.  Probabilities of causation: Bounds and identification , 2000, Annals of Mathematics and Artificial Intelligence.

[85]  N. Wermuth,et al.  Causality: a Statistical View , 2004 .

[86]  D. Rubin Direct and Indirect Causal Effects via Potential Outcomes , 2004 .

[87]  Larry Wasserman,et al.  All of Statistics: A Concise Course in Statistical Inference , 2004 .

[88]  J. Woodward Making Things Happen: A Theory of Causal Explanation , 2003 .

[89]  Roger Brent,et al.  A Fishing Buddy for Hypothesis Generators , 2005, Science.

[90]  Chen Avin,et al.  Identifiability of Path-Specific Effects , 2005, IJCAI.

[91]  Manabu Kuroki,et al.  Variance Estimators for Three “Probabilities of Causation” , 2005, Risk analysis : an official publication of the Society for Risk Analysis.

[92]  Judea Pearl,et al.  Identification of Conditional Interventional Distributions , 2006, UAI.

[93]  Mark J van der Laan,et al.  Estimation of Direct Causal Effects , 2006, Epidemiology.

[94]  Matthew S. Fritz,et al.  Mediation analysis. , 2019, Annual review of psychology.

[95]  J. Robins,et al.  Four Types of Effect Modification: A Classification Based on Directed Acyclic Graphs , 2007, Epidemiology.

[96]  Judea Pearl,et al.  What Counterfactuals Can Be Tested , 2007, UAI.

[97]  D. Rubin The design versus the analysis of observational studies for causal effects: parallels with the design of randomized trials , 2007, Statistics in medicine.

[98]  H. White,et al.  An Extended Class of Instrumental Variables for the Estimation of Causal Effects , 2011 .

[99]  Christopher Winship,et al.  Counterfactuals and Causal Inference: Methods and Principles for Social Research , 2007 .

[100]  M. Sobel Identification of Causal Parameters in Randomized Studies With Mediating Variables , 2008 .

[101]  Judea Pearl,et al.  Dormant Independence , 2008, AAAI.

[102]  J. Heckman,et al.  Econometric Causality , 2008 .

[103]  Onyebuchi A Arah,et al.  The role of causal reasoning in understanding Simpson's paradox, Lord's paradox, and the suppression effect: covariate selection in the analysis of observational studies , 2008, Emerging themes in epidemiology.

[104]  Michael D. Perlman,et al.  How Likely Is Simpson’s Paradox? , 2009 .

[105]  D. Rubin Should observational studies be designed to allow lack of balance in covariate distributions across treatment groups? , 2009 .

[106]  Tyler J. VanderWeele,et al.  Marginal Structural Models for the Estimation of Direct and Indirect Effects , 2009, Epidemiology.

[107]  Judea Pearl,et al.  Letter to the Editor: Remarks on the Method of Propensity Score , 2009 .

[108]  Judea Pearl,et al.  Confounding Equivalence in Observational Studies , 2009 .

[109]  J. Pearl,et al.  Effects of Treatment on the Treated: Identification and Generalization , 2009, UAI.

[110]  Tin Kam Ho,et al.  A Regression Paradox for Linear Models: Sufficient Conditions and Relation to Simpson’s Paradox , 2009 .

[111]  鄭宇庭 行銷硏究 : Marketing research , 2009 .

[112]  J. Pearl Myth, Confusion, and Science in Causal Analysis , 2009 .

[113]  G. Smith,et al.  The social gradient in birthweight at term: quantification of the mediating role of maternal smoking and body mass index. , 2009, Human reproduction.

[114]  Sharon Schwartz,et al.  Opening the Black Box: a motivation for the assessment of mediation. , 2009, International journal of epidemiology.

[115]  L. Keele,et al.  Identification, Inference and Sensitivity Analysis for Causal Mediation Effects , 2010, 1011.1079.

[116]  Judea Pearl,et al.  Mediating Instrumental Variables , 2011 .

[117]  G. Gurtner,et al.  Statistics in medicine. , 2011, Plastic and reconstructive surgery.

[118]  Michael P. Murray,et al.  Instrumental Variables , 2011, International Encyclopedia of Statistical Science.

[119]  Sander Greenland,et al.  Causal Diagrams , 2011, International Encyclopedia of Statistical Science.