Using Machine Learning Methods to Support Causal Inference in Econometrics

We introduce machine learning methods in econometrics and show how they can assist in causal inference. We begin with an extended presentation of the lasso (least absolute shrinkage and selection operator) of Tibshirani [50]. We then discuss the ‘Post-Double-Selection’ (PDS) estimator of Belloni et al. [13, 19] and show how it uses the lasso to address the omitted confounders problem. The PDS methodology is particularly useful when the researcher has a high-dimensional set of potential control variables and must balance including enough controls to eliminate omitted variable bias against including so many that the model overfits. The last part of the paper discusses recent developments in the field that go beyond the PDS approach.
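As a rough illustration of the post-double-selection idea described above, the following Python sketch runs one lasso of the outcome on the candidate controls, a second lasso of the treatment on the same controls, and then an OLS regression of the outcome on the treatment plus the union of the selected controls. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names `post_double_selection` and `simulate`, the simulated data, and the use of cross-validation (scikit-learn's `LassoCV`) to choose the penalty are all choices made here for convenience. The PDS literature typically uses a theory-driven penalty instead.

```python
import numpy as np
from sklearn.linear_model import LassoCV


def post_double_selection(y, d, X, cv=5, seed=0):
    """Illustrative PDS sketch (hypothetical helper, not from the paper).

    Returns the OLS coefficient on the treatment d and the indices of the
    controls selected by either lasso step.
    """
    # Step 1: lasso of the outcome on the candidate controls.
    sel_y = np.flatnonzero(LassoCV(cv=cv, random_state=seed).fit(X, y).coef_)
    # Step 2: lasso of the treatment on the candidate controls.
    sel_d = np.flatnonzero(LassoCV(cv=cv, random_state=seed).fit(X, d).coef_)
    # Step 3: OLS of y on an intercept, d, and the union of selected controls.
    selected = np.union1d(sel_y, sel_d)
    Z = np.column_stack([np.ones(len(y)), d, X[:, selected]])
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
    return beta[1], selected


def simulate(n=500, p=100, effect=1.0, seed=0):
    """Simulated data (an assumption for illustration): x0 confounds d and y."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, p))
    d = X[:, 0] + 0.5 * X[:, 1] + rng.standard_normal(n)
    y = effect * d + 2.0 * X[:, 0] + X[:, 2] + rng.standard_normal(n)
    return y, d, X


y, d, X = simulate()
alpha_hat, selected = post_double_selection(y, d, X)
# Naive slope from regressing y on d alone, with no controls.
naive = np.polyfit(d, y, 1)[0]
print(f"naive: {naive:.2f}  PDS: {alpha_hat:.2f}  controls kept: {len(selected)}")
```

In this simulated design the regression without controls is biased upward because the confounder `x0` is omitted, while the estimate after double selection should land close to the true effect of 1.0. Taking the union of both selection steps is what guards against dropping a control that matters mainly for the treatment equation.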

[1] T. Speed, et al. On the Application of Probability Theory to Agricultural Experiments. Essay on Principles. Section 9, 1990.

[2] Kirk Bansak, et al. Improving refugee integration through data-driven algorithmic assignment, 2018, Science.

[3] Aad van der Vaart, et al. The Cross-Validated Adaptive Epsilon-Net Estimator, 2006.

[4] A. Belloni, et al. Sparse Models and Methods for Optimal Instruments with an Application to Eminent Domain, 2012.

[5] R. Fisher. Statistical methods for research workers, 1927, Protoplasma.

[6] A. Belloni, et al. Least Squares After Model Selection in High-Dimensional Sparse Models, 2009, 1001.0188.

[7] Christian Hansen, et al. High-Dimensional Methods and Inference on Structural and Treatment Effects, 2013.

[8] Joseph G. Altonji, et al. Small Sample Bias in GMM Estimation of Covariance Structures, 1994.

[9] Matt Taddy, et al. Measuring Group Differences in High‐Dimensional Choices: Method and Application to Congressional Speech, 2019, Econometrica.

[10] Christian Hansen, et al. Post-Selection and Post-Regularization Inference in Linear Models with Many Controls and Instruments, 2015, 1501.03185.

[11] A. V. D. Vaart, et al. Asymptotic Statistics, 1998.

[12] A. Belloni, et al. Inference for High-Dimensional Sparse Econometric Models, 2011, 1201.0220.

[13] Bing-Yi Jing, et al. Self-normalized Cramér-type large deviations for independent random variables, 2003.

[14] P. Bickel, et al. Simultaneous Analysis of Lasso and Dantzig Selector, 2008, 0801.1095.

[15] Sendhil Mullainathan, et al. Machine Learning: An Applied Econometric Approach, 2017, Journal of Economic Perspectives.

[16] J. Wooldridge. Violating Ignorability of Treatment by Controlling for Too Many Factors, 2005, Econometric Theory.

[17] R. Backhouse, et al. The Age of the Applied Economist: The Transformation of Economics Since the 1970s, 2016.

[18] Stefan Wager, et al. Adaptive Concentration of Regression Trees, with Application to Random Forests, 2015.

[19] R. Tibshirani. Regression Shrinkage and Selection via the Lasso, 1996.

[20] J. Robins, et al. Estimation of Regression Coefficients When Some Regressors are not Always Observed, 1994.

[21] Fabian J. Theis, et al. Trevor Hastie, Robert Tibshirani, and Martin Wainwright. Statistical Learning with Sparsity: The Lasso and Generalizations. Boca Raton: CRC Press, 2018, Biometrics.

[22] Yang Ning, et al. Robust Estimation of Causal Effects via High-Dimensional Covariate Balancing Propensity Score, 2018, 1812.08683.

[23] A. Deaton, et al. Understanding and Misunderstanding Randomized Controlled Trials, 2016, Social science & medicine.

[24] J. Hahn. On the Role of the Propensity Score in Efficient Semiparametric Estimation of Average Treatment Effects, 1998.

[25] A. Belloni, et al. Inference on Treatment Effects after Selection Amongst High-Dimensional Controls, 2011, 1201.0224.

[26] Susan Athey, et al. Machine Learning Methods That Economists Should Know About, 2019, Annual Review of Economics.

[27] Edward H. Kennedy. Semiparametric theory and empirical processes in causal inference, 2015, 1510.04740.

[28] Jeffrey M. Wooldridge, et al. Solutions Manual and Supplementary Materials for Econometric Analysis of Cross Section and Panel Data, 2003.

[29] D. Horvitz, et al. A Generalization of Sampling Without Replacement from a Finite Universe, 1952.

[30] G. Imbens, et al. Approximate residual balancing: debiased inference of average treatment effects in high dimensions, 2016, 1604.07125.

[31] Joshua D. Angrist, et al. Split-Sample Instrumental Variables Estimates of the Return to Schooling, 1995.

[32] Leif D. Nelson, et al. False-Positive Psychology, 2011, Psychological science.

[33] Susan Athey, et al. The State of Applied Econometrics - Causality and Policy Evaluation, 2016, 1607.00699.

[34] D. Hamermesh. Six Decades of Top Economics Publishing: Who and How?, 2012.

[35] W. J. Hall, et al. Information and Asymptotic Efficiency in Parametric-Nonparametric Models, 1983.

[36] Sylvain Arlot, et al. A survey of cross-validation procedures for model selection, 2009, 0907.4728.

[37] Victor Chernozhukov, et al. High Dimensional Sparse Econometric Models: An Introduction, 2011, 1106.5242.

[38] Pierre Azoulay, et al. Economic Research Evolves: Fields and Styles, 2017.

[39] A. Belloni, et al. Program evaluation and causal inference with high-dimensional data, 2013, 1311.2645.

[40] Stefan Wager, et al. Estimation and Inference of Heterogeneous Treatment Effects using Random Forests, 2015, Journal of the American Statistical Association.

[41] Joshua D. Angrist, et al. The Credibility Revolution in Empirical Economics: How Better Research Design is Taking the Con Out of Econometrics, 2010, SSRN Electronic Journal.

[42] Matt Taddy, et al. Text As Data, 2017, Journal of Economic Literature.

[43] Mark E. Schaffer, et al. lassopack: Model selection and prediction with regularized regression in Stata, 2019, 1901.05397.

[44] James L. Powell, et al. Estimation of semiparametric models, 1994.

[45] Katherine A. Kiel, et al. House Prices during Siting Decision Stages: The Case of an Incinerator from Rumor through Operation, 1995.

[46] James J. Feigenbaum, et al. Automated Census Record Linking: A Machine Learning Approach, 2016.

[47] J. Robins, et al. Double/Debiased Machine Learning for Treatment and Structural Parameters, 2017.

[48] Victor Chernozhukov, et al. On cross-validated Lasso in high dimensions, 2020.

[49] Christian Hansen, et al. Inference in High-Dimensional Panel Models With an Application to Gun Control, 2014, 1411.6507.