Essays on Matching and Weighting for Causal Inference in Observational Studies

María de los Angeles Resa Juárez

This thesis consists of three papers on matching and weighting methods for causal inference. The first paper conducts a Monte Carlo simulation study to evaluate the performance of multivariate matching methods that select a subset of treatment and control observations. The matching methods studied are the widely used nearest neighbor matching with propensity score calipers and two more recently proposed methods: optimal matching of an optimally chosen subset, and optimal cardinality matching. The main findings are: (i) covariate balance, as measured by differences in means, variance ratios, Kolmogorov-Smirnov distances, and cross-match test statistics, is better with cardinality matching, since by construction it satisfies the balance requirements; (ii) for given levels of covariate balance, the matched samples are larger with cardinality matching than with the other methods; (iii) in terms of covariate distances, optimal subset matching performs best; (iv) treatment effect estimates from cardinality matching have lower RMSEs, provided that strong balance requirements are imposed: specifically, fine balance or strength-k balance, plus close mean balance. In standard practice, a matched sample is considered balanced if the absolute differences in means of the covariates across treatment groups are smaller than 0.1 standard deviations. However, the simulation results suggest that stronger forms of balance should be pursued in order to remove systematic biases due to observed covariates when a difference-in-means treatment effect estimator is used. In particular, if the true outcome model is additive, then the marginal distributions should be balanced, and if the true outcome model is additive with interactions, then the low-dimensional joint distributions should be balanced.
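To make the 0.1 standard deviation rule concrete, the following is a minimal Python sketch of an absolute standardized mean difference check. The function name, the toy data, and the choice of a pooled standard deviation (the square root of the average of the two group variances) are illustrative assumptions, not details taken from the thesis.

```python
import numpy as np

def standardized_mean_differences(X, treated):
    """Absolute difference in covariate means, in pooled-SD units.

    X       : (n, k) array of covariates
    treated : length-n boolean indicator of treatment
    The pooled SD used here is an assumed convention.
    """
    X = np.asarray(X, dtype=float)
    treated = np.asarray(treated, dtype=bool)
    x_t, x_c = X[treated], X[~treated]
    pooled_sd = np.sqrt((x_t.var(axis=0, ddof=1) + x_c.var(axis=0, ddof=1)) / 2.0)
    return np.abs(x_t.mean(axis=0) - x_c.mean(axis=0)) / pooled_sd

# Toy matched sample (hypothetical data): the first covariate is
# closely balanced, the second is not.
X = np.array([
    [1.00, 10.0], [2.00, 12.0], [3.00, 14.0],   # treated units
    [1.05, 20.0], [2.05, 22.0], [3.05, 24.0],   # control units
])
treated = np.array([True, True, True, False, False, False])

smd = standardized_mean_differences(X, treated)
balanced = smd < 0.1   # the conventional threshold mentioned above
print(smd, balanced)
```

A check of this kind looks only at means. As the simulation results summarized above suggest, passing it does not by itself imply that marginal or low-dimensional joint distributions are balanced.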
The second paper focuses on longitudinal studies, where marginal structural models (MSMs) are widely used to estimate the effect of time-dependent treatments in the presence of time-dependent confounders. Under a sequential ignorability assumption, MSMs yield unbiased treatment effect estimates by weighting each observation by the inverse of the probability of its observed treatment sequence given its history of observed covariates. However, these probabilities are typically estimated by fitting a propensity score model, and the resulting weights can fail to adjust for observed covariates due to model misspecification. These weights also tend to yield very unstable estimates if the predicted probabilities of treatment are very close to zero, which is often the case in practice. To address both problems, the paper takes a design-based approach: instead of modeling the probabilities of treatment, it directly finds the weights of minimum variance that adjust for the covariates across all possible treatment histories. To this end, the role of weighting in longitudinal studies of treatment effects is analyzed, and a convex optimization problem that can be solved efficiently is defined. Unlike standard methods, this approach makes evident to the investigator the limitations imposed by the data when estimating causal effects without extrapolating. A simulation study shows that this approach outperforms standard methods, providing less biased and more precise estimates of time-varying treatment effects in a variety of settings. The proposed method is used on Chilean educational data to estimate the cumulative effect of attending a private subsidized school, as opposed to a public school, on students' university admission test scores.

The third paper is centered on observational studies with multi-valued treatments. Generalizing methods for matching and stratifying to accommodate multi-valued treatments has proven to be a complex task. A natural way to address confounding in this case is to weight the observations, typically by the inverse probability of treatment weights (IPTW). As in the MSM case, these weights can be highly variable and can produce unstable estimates when some weights are extreme. In addition, model misspecification, small sample sizes, and truncation of extreme weights can cause the weights to fail to adjust appropriately for observed confounders. The paper determines the conditions the weights need to satisfy in order to provide close to unbiased treatment effect estimates with reduced variability, and defines a convex optimization problem, solvable in polynomial time, to obtain them. A simulation study with different settings is conducted to compare the proposed weighting scheme to IPTW, including generalized propensity score estimation methods that also explicitly consider covariate balance in the probability estimation process. The applicability of the methods to continuous treatments is also tested. The results show that directly targeting balance with the weights, instead of focusing on estimating treatment assignment probabilities, provides the best results in terms of bias and root mean square error of the treatment effect estimator. The effects of the intensity level of the 2010 Chilean earthquake on posttraumatic stress disorder are estimated using the proposed methodology.
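The idea of directly targeting balance with minimum-variance weights can be illustrated with a deliberately simplified Python sketch. It is not the formulation used in the thesis: it assumes exact (rather than approximate) mean balance, omits nonnegativity constraints on the weights, and handles a single treatment group, so the problem reduces to a minimum-norm linear system with a closed-form solution. The function name and the simulated data are hypothetical.

```python
import numpy as np

def min_variance_balancing_weights(X, target_means):
    """Weights w minimizing sum(w**2) subject to
        sum(w) == 1  and  X.T @ w == target_means.

    With the sum fixed at 1, minimizing sum(w**2) is equivalent to
    minimizing the variance of the weights. Simplifying assumptions:
    exact balance and no nonnegativity constraint, so weights may be
    negative; a constrained version would need a QP solver.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    A = np.vstack([np.ones(n), X.T])          # (k + 1, n) constraint matrix
    b = np.concatenate([[1.0], target_means])  # right-hand side
    # The pseudoinverse gives the minimum-norm solution of A @ w = b.
    return np.linalg.pinv(A) @ b

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))            # covariates of one treatment group
target = np.array([0.2, -0.1, 0.0])     # hypothetical target covariate means

w = min_variance_balancing_weights(X, target)
print(w.sum(), X.T @ w)                 # sum of weights and weighted means

# If the group is already balanced on the target, uniform weights result.
w_uniform = min_variance_balancing_weights(X, X.mean(axis=0))
```

In this sketch the weights depend only on the covariates and the balance targets, with no treatment assignment model being fitted, which is the contrast with IPTW drawn above.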
