High Dimensional Propensity Score Estimation via Covariate Balancing

In this paper, we address the problem of estimating the average treatment effect (ATE) and the average treatment effect for the treated (ATT) in observational studies when the number of potential confounders is possibly much greater than the sample size. In particular, we develop a robust method to estimate the propensity score via covariate balancing in high-dimensional settings. Since it is usually impossible to obtain exact covariate balance in high dimensions, we propose to estimate the propensity score by balancing a carefully selected subset of covariates that are predictive of the outcome, under the assumption that the outcome model is linear and sparse. The estimated propensity score is then used in the Horvitz-Thompson estimator to infer the ATE and ATT. We prove that the proposed methodology has desirable properties, including sample boundedness, root-n consistency, asymptotic normality, and semiparametric efficiency. We then extend these results to the case where the outcome model is a sparse generalized linear model. In addition, we show that the proposed estimator remains root-n consistent and asymptotically normal even when the propensity score model is misspecified. Finally, we conduct simulation studies to evaluate the finite-sample performance of the proposed methodology and apply it to estimate the causal effects of college attendance on adulthood political participation. Open-source software is available for implementing the proposed methodology.
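
As a rough illustration of the two ingredients named above, the following Python sketch computes a Horvitz-Thompson estimate of the ATE from already-estimated propensity scores, together with a simple covariate imbalance measure of the kind that balancing methods try to drive toward zero. It is not the paper's estimator or its software: the function names `horvitz_thompson_ate` and `balance_gap` are hypothetical, the propensity scores are assumed to be given and strictly between 0 and 1, and the covariate selection step is omitted entirely.

```python
import numpy as np


def horvitz_thompson_ate(y, t, pscore):
    """Horvitz-Thompson ATE estimate with inverse-probability weights.

    y: outcomes, t: binary treatment indicators (0/1),
    pscore: estimated P(T = 1 | X), assumed strictly inside (0, 1).
    """
    y = np.asarray(y, dtype=float)
    t = np.asarray(t, dtype=float)
    pscore = np.asarray(pscore, dtype=float)
    # Weighted mean outcome under treatment and under control.
    treated_mean = np.mean(t * y / pscore)
    control_mean = np.mean((1.0 - t) * y / (1.0 - pscore))
    return treated_mean - control_mean


def balance_gap(x, t, pscore):
    """Per-covariate imbalance between the weighted treated and control groups.

    x: (n, d) array of covariates (hypothetically, the selected subset).
    Returns a length-d vector; values near zero indicate balance.
    """
    x = np.asarray(x, dtype=float)
    t = np.asarray(t, dtype=float)
    pscore = np.asarray(pscore, dtype=float)
    # Signed inverse-probability weight for each unit.
    w = t / pscore - (1.0 - t) / (1.0 - pscore)
    return (w[:, None] * x).mean(axis=0)
```

With a constant propensity score of 0.5 and equally sized groups, the first function reduces to a plain difference in group means, which makes a convenient sanity check before using estimated scores.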
