Estimation of causal effects of binary treatments in unconfounded studies

Estimation of causal effects in non-randomized studies comprises two distinct phases: design, which is conducted without outcome data, and analysis of the outcome data according to a specified protocol. Recently, Gutman and Rubin (2013) proposed a new analysis-phase method for estimating treatment effects when the outcome is binary and there is only one covariate; the method views causal effect estimation explicitly as a missing data problem. Here, we extend this method to situations with continuous outcomes and multiple covariates, and we compare it with other commonly used methods, such as matching, subclassification, weighting, and covariance adjustment. Using an extensive simulation, we show that in many of the experimental conditions examined, our new 'multiple-imputation using two subclassification splines' method appears to be the most efficient of the methods considered and has coverage levels that are closest to nominal. In addition, it can estimate finite population average causal effects as well as non-linear causal estimands. This type of analysis also allows the identification of subgroups of units for which the effect appears to be especially beneficial or harmful.
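To make the missing-data view of causal estimation concrete, the following is a minimal sketch and not the 'multiple-imputation using two subclassification splines' method itself. It assumes a separate linear outcome model in each treatment arm as a stand-in for the spline-based imputation model, and the function name `impute_ate`, the simulated data, and all parameter values are hypothetical. Each unit's unobserved potential outcome is imputed repeatedly, and the finite population average effect is computed from each completed data set.

```python
import numpy as np

def impute_ate(X, w, y, n_imputations=20, seed=0):
    """Impute missing potential outcomes and average the unit-level effects.

    Assumption: one linear model per arm replaces the paper's spline model.
    Returns the mean estimate and its variance across imputations.
    """
    rng = np.random.default_rng(seed)
    n = len(y)
    Xd = np.column_stack([np.ones(n), X])        # design matrix with intercept
    fits = {}
    for arm in (0, 1):
        Xa, ya = Xd[w == arm], y[w == arm]
        XtX_inv = np.linalg.inv(Xa.T @ Xa)
        beta_hat = XtX_inv @ Xa.T @ ya           # least-squares fit in this arm
        resid = ya - Xa @ beta_hat
        fits[arm] = (beta_hat, XtX_inv, resid @ resid, len(ya) - Xa.shape[1])
    estimates = []
    for _ in range(n_imputations):
        y_pot = np.empty((n, 2))                 # columns: Y(0), Y(1)
        for arm in (0, 1):
            beta_hat, XtX_inv, rss, df = fits[arm]
            sigma2 = rss / rng.chisquare(df)     # draw the residual variance
            beta = rng.multivariate_normal(beta_hat, sigma2 * XtX_inv)
            y_pot[:, arm] = Xd @ beta + rng.normal(0.0, np.sqrt(sigma2), n)
            y_pot[w == arm, arm] = y[w == arm]   # observed outcomes are kept
        estimates.append(np.mean(y_pot[:, 1] - y_pot[:, 0]))
    estimates = np.asarray(estimates)
    return estimates.mean(), estimates.var(ddof=1)

# Hypothetical simulated study with a constant treatment effect of 2.0.
rng = np.random.default_rng(1)
n = 500
X = rng.normal(size=(n, 2))
propensity = 1.0 / (1.0 + np.exp(-(0.5 * X[:, 0] - 0.25 * X[:, 1])))
w = rng.binomial(1, propensity)
y0 = X @ np.array([1.0, -1.0]) + rng.normal(size=n)
y1 = y0 + 2.0
y = np.where(w == 1, y1, y0)                     # only one outcome is observed

est, between_var = impute_ate(X, w, y)
print(f"estimated effect: {est:.2f}, between-imputation variance: {between_var:.4f}")
```

Because every imputed data set contains both potential outcomes for every unit, the same completed data could in principle be used to summarize other estimands, including effects within subgroups of units.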

[1] D. Rubin, et al. Estimation of causal effects of binary treatments in unconfounded studies with one continuous covariate, 2017, Statistical methods in medical research.

[2] D. Rubin. [On the Application of Probability Theory to Agricultural Experiments. Essay on Principles. Section 9.] Comment: Neyman (1923) and Causal Inference in Experiments and Observational Studies, 1990.

[3] K. Imai, et al. Covariate balancing propensity score, 2014.

[4] T. Speed, et al. On the Application of Probability Theory to Agricultural Experiments. Essay on Principles. Section 9, 1990.

[5] R. Little, et al. Robust Likelihood-based Analysis of Multivariate Data with Missing Values, 2003.

[6] R. A. Leibler, et al. On Information and Sufficiency, 1951.

[7] Donald B. Rubin, et al. Multivariate matching methods that are equal percent bias reducing, 1974.

[8] D. Basu. Randomization Analysis of Experimental Data: The Fisher Randomization Test, 1980.

[9] Elizabeth A Stuart, et al. Matching methods for causal inference: A review and a look forward, 2010, Statistical science: a review journal of the Institute of Mathematical Statistics.

[10] Richard K. Crump, et al. Dealing with limited overlap in estimation of average treatment effects, 2009.

[11] Donald B. Rubin, et al. Affinely Invariant Matching Methods with Ellipsoidal Distributions, 1992.

[12] Roger A. Sugden, et al. Multiple Imputation for Nonresponse in Surveys, 1988.

[13] G. Imbens, et al. Large Sample Properties of Matching Estimators for Average Treatment Effects, 2004.

[14] D. Rubin, et al. Reducing Bias in Observational Studies Using Subclassification on the Propensity Score, 1984.

[15] Donald B. Rubin, et al. Statistical Matching Using File Concatenation With Adjusted Weights and Multiple Imputations, 1986.

[16] D. Rubin, et al. Small-sample degrees of freedom with multiple imputation, 1999.

[17] D. Rubin, et al. Using Multivariate Matched Sampling and Regression Adjustment to Control Bias in Observational Studies, 1978.

[18] A. Azzalini, et al. Distributions generated by perturbation of symmetry with emphasis on a multivariate skew t-distribution, 2003, 0911.2342.

[19] Donald B. Rubin, et al. Bayesian Inference for Causal Effects: The Role of Randomization, 1978.

[20] Guangyu Zhang, et al. Extensions of the Penalized Spline of Propensity Prediction Method of Imputation, 2009, Biometrics.

[21] D. Rubin. Randomization Analysis of Experimental Data: The Fisher Randomization Test Comment, 1980.

[22] Elizabeth, et al. Matching Methods for Causal Inference, 2007.

[23] D B Rubin, et al. Robust estimation of causal effects of binary treatments in unconfounded studies with dichotomous outcomes, 2013, Statistics in medicine.

[24] Roderick J. A. Little, et al. Statistical Analysis with Missing Data, 2002.

[25] D. Rubin. Statistical Inference for Causal Effects, With Emphasis on Applications in Epidemiology and Medical Statistics, 2007.

[26] S. Walker, et al. A Bayesian approach to non-parametric monotone function estimation, 2009.

[27] P. Mahalanobis. On the generalized distance in statistics, 1936.

[28] Jianhua Lin, et al. Divergence measures based on the Shannon entropy, 1991, IEEE Trans. Inf. Theory.

[29] Joseph Kang, et al. Demystifying Double Robustness: A Comparison of Alternative Strategies for Estimating a Population Mean from Incomplete Data, 2007, 0804.2958.

[30] D. Rubin. The design versus the analysis of observational studies for causal effects: parallels with the design of randomized trials, 2007, Statistics in medicine.

[31] E. Stuart, et al. Using full matching to estimate causal effects in nonexperimental studies: examining the relationship between adolescent marijuana use and adult outcomes, 2008, Developmental psychology.

[32] J. Robins, et al. Doubly Robust Estimation in Missing Data and Causal Inference Models, 2005, Biometrics.

[33] D. Rubin. Estimating causal effects of treatments in randomized and nonrandomized studies, 1974.

[34] D. Rubin. Using Propensity Scores to Help Design Observational Studies: Application to the Tobacco Litigation, 2001, Health Services and Outcomes Research Methodology.

[35] J. Lunceford, et al. Stratification and weighting via the propensity score in estimation of causal treatment effects: a comparative study, 2004, Statistics in medicine.

[36] D. Rubin. Matched Sampling for Causal Effects: Matching to Remove Bias in Observational Studies, 1973.

[37] R. Little. Missing-Data Adjustments in Large Surveys, 1988.

[38] Michael E. Sobel, et al. Causal Inference in the Social Sciences, 2000.

[39] Peter M. Steiner, et al. Can Nonrandomized Experiments Yield Accurate Answers? A Randomized Experiment Comparing Random and Nonrandom Assignments, 2008.

[40] D. Rubin. Inference and Missing Data, 1975.

[41] D. Rubin, et al. Combining Propensity Score Matching with Additional Adjustments for Prognostic Covariates, 2000.

[42] Ingeborg Waernbaum, et al. Model misspecification and robustness in causal inference: comparing matching with doubly robust estimation, 2012, Statistics in medicine.

[43] Richard W. Hamming, et al. Error detecting and error correcting codes, 1950.

[44] D. Rubin, et al. The central role of the propensity score in observational studies for causal effects, 1983.

[45] D. Rubin. For objective causal inference, design trumps analysis, 2008, 0811.1640.

[46] Donald B. Rubin, et al. Comment: Neyman (1923) and Causal Inference in Experiments and Observational Studies, 2007.

[47] G. Imbens, et al. Bias-Corrected Matching Estimators for Average Treatment Effects, 2002.

[48] B. Hansen. Full Matching in an Observational Study of Coaching for the SAT, 2004.

[49] Donald B. Rubin, et al. Formal modes of statistical inference for causal effects, 1990.

[50] W. G. Cochran. The effectiveness of adjustment by subclassification in removing bias in observational studies, 1968, Biometrics.

[51] D. Rubin. Direct and Indirect Causal Effects via Potential Outcomes, 2004.

[52] John A. Nelder, et al. Generalized linear models, 2nd ed., 1993.