Causal Inference and Data Fusion in Econometrics

Learning about cause and effect is arguably the main goal of applied econometrics. In practice, the validity of causal inferences is contingent on a number of critical assumptions regarding the type of data that have been collected and the substantive knowledge that is available. For instance, unobserved confounding factors threaten the internal validity of estimates, data availability is often limited to non-random, selection-biased samples, causal effects need to be learned from surrogate experiments with imperfect compliance, and causal knowledge has to be extrapolated across structurally heterogeneous populations. A powerful causal inference framework is required to tackle these challenges, which plague most data analyses to varying degrees. Building on the structural approach to causality introduced by Haavelmo (1943) and the graph-theoretic framework proposed by Pearl (1995), the artificial intelligence (AI) literature has developed a wide array of techniques for causal learning that make it possible to leverage information from various imperfect, heterogeneous, and biased data sources (Bareinboim and Pearl, 2016). In this paper, we discuss recent advances in this literature that have the potential to contribute to econometric methodology along three dimensions. First, they provide a unified and comprehensive framework for causal inference, in which the aforementioned problems can be addressed in full generality. Second, owing to their origin in AI, they come with sound, efficient, and complete algorithmic criteria for automating the corresponding identification task. Third, because graph-theoretic approaches build on a nonparametric description of structural models, they combine the strengths of both structural econometrics and the potential outcomes framework, and thus offer an effective middle ground between these two competing strands of the literature.

[1]  Rosa L. Matzkin Nonparametric Identification in Structural Economic Models , 2013 .

[2]  Judea Pearl,et al.  Identification of Conditional Interventional Distributions , 2006, UAI.

[3]  J. Pearl Reflections on Heckman and Pinto's Causal Analysis After Haavelmo , 2013 .

[4]  D. Rubin Estimating causal effects of treatments in randomized and nonrandomized studies. , 1974 .

[5]  J. Heckman Sample selection bias as a specification error , 1979 .

[6]  Petra E. Todd,et al.  Matching As An Econometric Evaluation Estimator , 1998 .

[7]  Scott E. Page Computational models from A to Z , 1999, Complex.

[8]  Guido W. Imbens Instrumental Variables: An Econometrician’s Perspective , 2014 .

[9]  Elias Bareinboim,et al.  Transportability of Causal Effects: Completeness Results , 2012, AAAI.

[10]  R. H. Strotz,et al.  RECURSIVE VS. NONRECURSIVE SYSTEMS: AN ATTEMPT AT SYNTHESIS (PART I OF A TRIPTYCH ON CAUSAL CHAIN SYSTEMS) , 1960 .

[11]  Steven D. Levitt,et al.  Sample Selection in the Estimation of Air Bag and Seat Belt Effectiveness , 1999, Review of Economics and Statistics.

[12]  James J. Heckman,et al.  Econometric Evaluation of Social Programs, Part I: Causal Models, Structural Models and Econometric Policy Evaluation , 2007 .

[13]  Joseph Y. Halpern Axiomatizing Causal Reasoning , 1998, UAI.

[14]  Johannes Textor,et al.  DAGitty: a graphical tool for analyzing causal diagrams. , 2011, Epidemiology.

[15]  Elias Bareinboim,et al.  Transportability of Causal and Statistical Relations: A Formal Approach , 2011, 2011 IEEE 11th International Conference on Data Mining Workshops.

[16]  D. Rubin,et al.  The central role of the propensity score in observational studies for causal effects , 1983 .

[17]  Petra E. Todd,et al.  Matching As An Econometric Evaluation Estimator: Evidence from Evaluating a Job Training Programme , 1997 .

[18]  Joshua D. Angrist,et al.  Lifetime Earnings and the Vietnam Era Draft Lottery: Evidence from Social Security Administrative Records , 1990 .

[19]  J. Pearl,et al.  Bounds on Treatment Effects from Studies with Imperfect Compliance , 1997 .

[20]  Amanda E. Kowalski How to Examine External Validity within an Experiment , 2018, Journal of Economics & Management Strategy.

[21]  Maciej Liskiewicz,et al.  Adjustment Criteria in Causal Diagrams: An Algorithmic Perspective , 2011, UAI.

[22]  Elias Bareinboim,et al.  External Validity: From Do-Calculus to Transportability Across Populations , 2014, Probabilistic and Causal Inference.

[23]  Jin Tian,et al.  Generalized Adjustment Under Confounding and Selection Biases , 2018, AAAI.

[24]  T. Haavelmo The Statistical Implications of a System of Simultaneous Equations , 1943 .

[25]  Jin Tian,et al.  A general identification condition for causal effects , 2002, AAAI/IAAI.

[26]  Elias Bareinboim,et al.  Transportability from Multiple Environments with Limited Experiments: Completeness Results , 2014, NIPS.

[27]  S. Mumford,et al.  Causation: A Very Short Introduction , 2013 .

[28]  Rosa L. Matzkin NONPARAMETRIC IDENTIFICATION , 2012 .

[29]  Elias Bareinboim,et al.  General Identifiability with Arbitrary Surrogate Experiments , 2019, UAI.

[30]  J. Pearl Causal diagrams for empirical research , 1995 .

[31]  Jiji Zhang,et al.  Causal Inference and Reasoning in Causally Insufficient Systems , 2006 .

[32]  Matthew Crosby,et al.  Association for the Advancement of Artificial Intelligence , 2014 .

[33]  Jiji Zhang,et al.  Causal Identification under Markov Equivalence , 2018, UAI.

[34]  Judea Pearl,et al.  Probabilistic reasoning in intelligent systems , 1988 .

[35]  J. Pearl,et al.  The Book of Why: The New Science of Cause and Effect , 2018 .

[36]  B. Hansen,et al.  On Recursiveness and Interdependency in Economic Models , 1954 .

[37]  J. Heckman The Common Structure of Statistical Models of Truncation, Sample Selection and Limited Dependent Variables and a Simple Estimator for Such Models , 1976 .

[38]  Mary S. Morgan,et al.  The stamping out of process analysis in econometrics , 1991 .

[39]  Elias Bareinboim,et al.  From Statistical Transportability to Estimating the Effect of Stochastic Interventions , 2019, IJCAI.

[40]  J. Kmenta Mostly Harmless Econometrics: An Empiricist's Companion , 2010 .

[41]  Vasant Honavar,et al.  Transportability from Multiple Environments with Limited Experiments , 2013, NIPS.

[42]  Jiji Zhang,et al.  A Graphical Criterion for Effect Identification in Equivalence Classes of Causal Diagrams , 2018, IJCAI.

[43]  Brett R. Gordon,et al.  A Comparison of Approaches to Advertising Measurement: Evidence from Big Field Experiments at Facebook , 2018, Mark. Sci.

[44]  R. L. Basmann The Causal Interpretation of Non-Triangular Systems of Economic Relations , 1963 .

[45]  J. Pearl Causality: Models, Reasoning and Inference , 2000 .

[46]  Roland G. Fryer An Empirical Analysis of Racial Differences in Police Use of Force , 2016, Journal of Political Economy.

[47]  E. Oster,et al.  Weighting for External Validity , 2017 .

[48]  Nancy Cartwright,et al.  Hunting Causes and Using Them: Against modularity, the causal Markov condition and any link between the two: comments on Hausman and Woodward , 2007 .

[49]  James Heckman,et al.  CAUSAL ANALYSIS AFTER HAAVELMO , 2013, Econometric Theory.

[50]  Jin Tian,et al.  Identification of Causal Effects in the Presence of Selection Bias , 2019, AAAI.

[51]  John A. List,et al.  Why Economists Should Conduct Field Experiments and 14 Tips for Pulling One Off , 2011 .

[52]  Jin Tian,et al.  Recovering Causal Effects from Selection Bias , 2015, AAAI.

[53]  Adam Glynn,et al.  Front-Door Difference-in-Differences Estimators , 2017 .

[54]  C. Manski Nonparametric Bounds on Treatment Effects , 1989 .

[55]  Elias Bareinboim,et al.  Causal Inference by Surrogate Experiments: z-Identifiability , 2012, UAI.

[56]  Herman Wold,et al.  The Fix-Point Approach to Interdependent Systems Review and Current Outlook , 1981 .

[57]  P. Schmidt,et al.  Limited-Dependent and Qualitative Variables in Econometrics. , 1984 .

[58]  Rachael Meager,et al.  Understanding the Average Impact of Microcredit Expansions: A Bayesian Hierarchical Analysis of Seven Randomized Experiments , 2019, American Economic Journal: Applied Economics.

[59]  Joshua D. Angrist,et al.  Mostly Harmless Econometrics: An Empiricist's Companion , 2008 .

[60]  E. Duflo,et al.  The Role of Information and Social Interactions in Retirement Plan Decisions: Evidence from a Randomized Experiment , 2002 .

[61]  Judea Pearl,et al.  Causal networks: semantics and expressiveness , 2013, UAI.

[62]  Elias Bareinboim,et al.  General Transportability - Synthesizing Observations and Experiments from Heterogeneous Domains , 2020, AAAI.

[63]  Judea Pearl,et al.  Identification of Joint Interventional Distributions in Recursive Semi-Markovian Causal Models , 2006, AAAI.

[64]  Rajeev Dehejia,et al.  From Local to Global: External Validity in a Fertility Natural Experiment , 2015, Journal of Business & Economic Statistics.

[65]  G. Ridder,et al.  The Econometrics of Data Combination , 2007 .

[66]  H. Wold Causality and Econometrics , 1954 .

[67]  D. Rubin,et al.  Causal Inference for Statistics, Social, and Biomedical Sciences: A General Method for Estimating Sampling Variances for Standard Estimators for Average Causal Effects , 2015 .

[68]  Elias Bareinboim,et al.  A General Algorithm for Deciding Transportability of Experimental Results , 2013, ArXiv.

[69]  Guido W. Imbens,et al.  Potential Outcome and Directed Acyclic Graph Approaches to Causality: Relevance for Empirical Practice in Economics , 2019, Journal of Economic Literature.

[70]  C. Manski Partial Identification of Probability Distributions , 2003 .

[71]  Elias Bareinboim,et al.  Controlling Selection Bias in Causal Inference , 2011, AISTATS.

[72]  V. J. Hotz,et al.  Predicting the efficacy of future training programs using past experiences at other locations , 2005 .

[73]  Arthur Lewbel,et al.  The Identification Zoo: Meanings of Identification in Econometrics , 2019 .

[74]  Johannes Textor,et al.  Complete Graphical Characterization and Construction of Adjustment Sets in Markov Equivalence Classes of Ancestral Graphs , 2016, J. Mach. Learn. Res..

[75]  J. Pearl TRYGVE HAAVELMO AND THE EMERGENCE OF CAUSAL CALCULUS , 2013, Econometric Theory.

[76]  Emi Nakamura,et al.  Identification in Macroeconomics , 2017, Journal of Economic Perspectives.

[77]  H. Wold,et al.  On statistical demand analysis from the viewpoint of simultaneous equations , 1946 .

[78]  G. Imbens Instrumental Variables: An Econometrician's Perspective , 2014, SSRN Electronic Journal.

[79]  Marco Valtorta,et al.  Pearl's Calculus of Intervention Is Complete , 2006, UAI.

[80]  Jin Tian,et al.  Recovering from Selection Bias in Causal and Statistical Inference , 2014, AAAI.

[81]  David M. Blei,et al.  Adapting Neural Networks for the Estimation of Treatment Effects , 2019, NeurIPS.

[82]  Jorge Martinez-Vazquez,et al.  Alternative value estimates of owner-occupied housing: Evidence on sample selection bias and systematic errors , 1986 .

[83]  J. Angrist,et al.  Identification and Estimation of Local Average Treatment Effects , 1995 .

[84]  Jiji Zhang,et al.  Causal Identification under Markov Equivalence: Completeness Results , 2019, ICML.

[85]  P. Spirtes,et al.  Causation, Prediction, and Search, 2nd Edition , 2001 .

[86]  J. Woodward Making Things Happen: A Theory of Causal Explanation , 2003 .

[87]  A. Wold,et al.  A GENERALIZATION OF CAUSAL CHAIN MODELS (PART III OF A TRIPTYCH ON CAUSAL CHAIN SYSTEMS) , 1960 .

[88]  Bernhard Schölkopf,et al.  Elements of Causal Inference: Foundations and Learning Algorithms , 2017 .

[89]  Elias Bareinboim,et al.  Causal inference and the data-fusion problem , 2016, Proceedings of the National Academy of Sciences.

[90]  J. Angrist Conditional Independence in Sample Selection Models , 1997 .

[91]  Elias Bareinboim,et al.  Identification and Model Testing in Linear Structural Equation Models using Auxiliary Variables , 2016, ICML.