Robust Estimation of Causal Effects via High-Dimensional Covariate Balancing Propensity Score.

We propose a robust method to estimate average treatment effects in observational studies when the number of potential confounders is possibly much greater than the sample size. Our method consists of three steps. First, we fit the propensity score and outcome models using a class of penalized $M$-estimators. Second, we calibrate the initial estimate of the propensity score by balancing a carefully selected subset of covariates that are predictive of the outcome. Finally, we use the calibrated propensity score to construct the inverse probability weighting estimator. We prove that the proposed estimator, which we call the high-dimensional covariate balancing propensity score, has the sample boundedness property and is root-$n$ consistent, asymptotically normal, and semiparametrically efficient when the propensity score model is correctly specified and the outcome model is linear in covariates. More importantly, we show that our estimator remains root-$n$ consistent and asymptotically normal so long as either the propensity score model or the outcome model is correctly specified. We provide valid confidence intervals in both cases and further extend these results to the case where the outcome model is a generalized linear model. In simulation studies, we find that the proposed methodology often estimates the average treatment effect more accurately than existing methods. We also present an empirical application, in which we estimate the average causal effect of college attendance on political participation in adulthood. An open-source software package is available for implementing the proposed methodology.
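The calibration and weighting steps described above can be illustrated with a minimal sketch. It is not the authors' software package, and it rests on several assumptions that the abstract does not state: a logistic propensity score model, balancing equations of the form $\sum_i (T_i/\pi_i - 1) X_{i,S} = 0$, and a Newton solver. The function name `calibrate_propensity`, the synthetic data, the crude intercept-only `gamma_init` (standing in for a penalized first-step fit), and the hand-picked index set `S` (standing in for the covariates selected by the outcome model) are all hypothetical.

```python
import numpy as np

def calibrate_propensity(X, T, gamma_init, S, n_iter=50, tol=1e-10):
    """Re-solve for gamma[S] so that sum_i (T_i / pi_i - 1) * X_i[S] = 0.

    Assumes a logistic propensity model pi_i = 1 / (1 + exp(-X_i @ gamma)).
    Coefficients outside S are held at their initial values.
    """
    gamma = np.array(gamma_init, dtype=float)
    XS = X[:, S]

    def loss(g):
        # Convex loss whose gradient in gamma[S] is minus the balance equation
        eta = X @ g
        return np.sum(T * np.exp(-eta) + (1 - T) * eta)

    for _ in range(n_iter):
        w = T * np.exp(-(X @ gamma))          # equals T/pi - T under the logit link
        grad = XS.T @ ((1 - T) - w)
        if np.max(np.abs(grad)) < tol * len(T):
            break
        step = np.linalg.solve((XS * w[:, None]).T @ XS, grad)
        base, t = loss(gamma), 1.0
        while True:                           # step halving keeps Newton stable
            trial = gamma.copy()
            trial[S] -= t * step
            if loss(trial) <= base + 1e-10 * (1 + abs(base)) or t < 1e-8:
                break
            t /= 2
        gamma = trial
    return gamma

# Synthetic data (assumption): more covariates than enter either model
rng = np.random.default_rng(0)
n, d = 2000, 50
X = np.column_stack([np.ones(n), rng.normal(size=(n, d))])  # column 0 = intercept
gamma_true = np.zeros(d + 1)
gamma_true[[1, 2]] = [0.5, -0.5]
T = rng.binomial(1, 1 / (1 + np.exp(-(X @ gamma_true))))
# Outcome under treatment, linear in covariates 1 and 3; only treated rows are used
Y = 1.0 + X[:, 1] + 2.0 * X[:, 3] + rng.normal(size=n)

# Stand-in for step 1: an intercept-only initial propensity estimate
gamma_init = np.zeros(d + 1)
gamma_init[0] = np.log(T.mean() / (1 - T.mean()))
# Stand-in for the selected subset: intercept plus outcome-predictive covariates
S = [0, 1, 3]

# Step 2: calibrate; step 3: inverse probability weighting for E[Y(1)]
gamma_cal = calibrate_propensity(X, T, gamma_init, S)
pi_hat = 1 / (1 + np.exp(-(X @ gamma_cal)))
weights = T / pi_hat
mu1_hat = np.mean(weights * Y)
balance = X[:, S].T @ (weights - 1) / n

print("max balance residual:", np.max(np.abs(balance)))
print("estimate of E[Y(1)]:", mu1_hat)
```

Because the intercept is included in `S`, the weights `T / pi_hat` are nonnegative and sum to `n`, so `mu1_hat` is a weighted average of the treated outcomes and cannot fall outside their observed range. This is one way to see the sample boundedness property mentioned in the abstract. In this sketch the control arm would be handled analogously, with weights `(1 - T) / (1 - pi_hat)`.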
