A Boosting Algorithm for Estimating Generalized Propensity Scores with Continuous Treatments

Abstract In this article, we study the causal inference problem with a continuous treatment variable using propensity score-based methods. For a continuous treatment, the generalized propensity score is defined as the conditional density of the treatment level given the covariates (confounders). The dose–response function is then estimated by inverse probability weighting, where the weights are calculated from the estimated propensity scores. When the dimension of the covariates is large, traditional nonparametric density estimation suffers from the curse of dimensionality. Some researchers have suggested a two-step estimation procedure that first models the mean function of the treatment given the covariates. In this study, we suggest a boosting algorithm to estimate that mean function. In boosting, an important tuning parameter is the number of trees to be generated, which essentially determines the trade-off between the bias and the variance of the causal estimator. We propose a criterion called the average absolute correlation coefficient (AACC) to determine the optimal number of trees. Simulation results show that the proposed approach performs better than a simple linear approximation or L2 boosting. The proposed methodology is also illustrated through the Early Dieting in Girls study, which examines the influence of mothers’ overall weight concern on daughters’ dieting behavior.
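The selection idea in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. It assumes normally distributed residuals around the fitted mean, so that the generalized propensity score is a normal density, and it reads AACC as the average absolute weighted Pearson correlation between the treatment and each covariate; the paper may define either step differently. The function names and the `staged_means` argument (fitted treatment means after each boosting iteration) are hypothetical.

```python
import numpy as np

def normal_pdf(x, mean, sd):
    """Normal density; using it for the treatment is an assumption of this sketch."""
    return np.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))

def stabilized_weights(t, mean_hat):
    """Stabilized inverse probability weights f(T) / f(T | X).

    Both densities are taken to be normal (assumption): the conditional one
    is centered at the fitted mean with the residual standard deviation.
    """
    resid = t - mean_hat
    cond = normal_pdf(t, mean_hat, resid.std())  # estimated density of T given X
    marg = normal_pdf(t, t.mean(), t.std())      # estimated marginal density of T
    return marg / cond

def weighted_corr(a, b, w):
    """Weighted Pearson correlation between two 1-D arrays."""
    w = w / w.sum()
    a_c = a - np.sum(w * a)
    b_c = b - np.sum(w * b)
    return np.sum(w * a_c * b_c) / np.sqrt(np.sum(w * a_c**2) * np.sum(w * b_c**2))

def aacc(t, X, w):
    """Average absolute weighted correlation between treatment and each covariate."""
    return float(np.mean([abs(weighted_corr(t, X[:, j], w)) for j in range(X.shape[1])]))

def select_stage(t, X, staged_means):
    """Return the index of the fitted mean with the smallest AACC, plus all scores."""
    scores = [aacc(t, X, stabilized_weights(t, m)) for m in staged_means]
    return int(np.argmin(scores)), scores
```

In practice `staged_means` could be the per-iteration predictions of a boosted tree model, for example from `staged_predict` on a scikit-learn `GradientBoostingRegressor`; the chosen index then corresponds to the number of trees at which the weighted sample shows the least residual association between treatment and covariates.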
