Optimal Covariate Balancing Conditions in Propensity Score Estimation

Inverse probability of treatment weighting (IPTW) is a popular method for estimating the average treatment effect (ATE). However, empirical studies show that IPTW estimators can be sensitive to misspecification of the propensity score model. To address this problem, researchers have proposed estimating the propensity score by directly optimizing the balance of pretreatment covariates. While these methods appear to perform well empirically, little is known about how the choice of balancing conditions affects their theoretical properties. To fill this gap, we first characterize the asymptotic bias and efficiency of the IPTW estimator based on the Covariate Balancing Propensity Score (CBPS) methodology under local model misspecification. Based on this analysis, we show how to optimally choose the covariate balancing functions and propose an optimal CBPS-based IPTW estimator. This estimator is doubly robust: it is consistent for the ATE if either the propensity score model or the outcome model is correct. In addition, the proposed estimator is locally semiparametric efficient when both models are correctly specified. To further relax the parametric assumptions, we extend our method by using a sieve estimation approach. We show that the resulting estimator is globally efficient.

∗ Supported by NSF grants DMS-1854637, DMS-1662139 and DMS-1712591 and NIH grant R01-GM072611-12. An earlier version of this paper was entitled "Improving Covariate Balancing Propensity Score: A Doubly Robust and Efficient Approach."

† Department of Operations Research and Financial Engineering, Princeton University

‡ Department of Government and Department of Statistics, Harvard University

§ Department of Electrical Engineering and Computer Science, Northwestern University

¶ Department of Statistics and Data Science, Cornell University

‖ Amazon
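As a rough illustration of the two ingredients named in the abstract, the sketch below computes a plain IPTW estimate of the ATE and the sample version of a first-moment covariate balancing condition. It is not the optimal CBPS estimator proposed in the paper. The simulated data, the logistic propensity score with slope 0.5, the true ATE of 2.0, and the function names `iptw_ate` and `weighted_imbalance` are all assumptions made for this example, and the true propensity score is plugged in rather than estimated.

```python
import math
import random


def iptw_ate(y, t, e):
    """IPTW estimate of the ATE from outcomes y, binary treatments t,
    and propensity scores e (each strictly between 0 and 1)."""
    n = len(y)
    treated = sum(ti * yi / ei for yi, ti, ei in zip(y, t, e))
    control = sum((1 - ti) * yi / (1 - ei) for yi, ti, ei in zip(y, t, e))
    return (treated - control) / n


def weighted_imbalance(x, t, e):
    """Sample analogue of E[(T/e - (1-T)/(1-e)) X], which is zero in
    expectation when e is the true propensity score."""
    n = len(x)
    return sum((ti / ei - (1 - ti) / (1 - ei)) * xi
               for xi, ti, ei in zip(x, t, e)) / n


def simulate(n=20000, seed=0):
    """Hypothetical data: one confounder x, logistic treatment model,
    and a linear outcome with a constant treatment effect of 2.0."""
    rng = random.Random(seed)
    x = [rng.gauss(0, 1) for _ in range(n)]
    e = [1 / (1 + math.exp(-0.5 * xi)) for xi in x]
    t = [1 if rng.random() < ei else 0 for ei in e]
    y = [2.0 * ti + xi + rng.gauss(0, 1) for xi, ti in zip(x, t)]
    return x, t, e, y


x, t, e, y = simulate()
ate_hat = iptw_ate(y, t, e)
imbalance = weighted_imbalance(x, t, e)

# Unweighted difference in means, which is confounded by x in this setup.
n_treated = sum(t)
naive = (sum(yi for yi, ti in zip(y, t) if ti) / n_treated
         - sum(yi for yi, ti in zip(y, t) if not ti) / (len(t) - n_treated))

print(f"IPTW ATE estimate: {ate_hat:.3f}")
print(f"Naive difference in means: {naive:.3f}")
print(f"Weighted imbalance in x: {imbalance:.3f}")
```

In this simulated setting the weighted imbalance should be close to zero because the true propensity score is used. Covariate balancing methods instead choose the propensity score parameters so that conditions of this kind hold in the sample, and the abstract's question is which balancing functions to use in place of `x`.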
