Defining the Study Population for an Observational Study to Ensure Sufficient Overlap: A Tree Approach

The internal validity of an observational study is enhanced by comparing only sets of treated and control subjects that have sufficient overlap in their covariate distributions. Methods have been developed for defining the study population using propensity scores to ensure sufficient overlap. However, a study population defined by propensity scores is difficult for other investigators to understand. We develop a method of defining a study population in terms of a tree, which is easy to understand and display and which has internal validity similar to that of a study population defined by propensity scores.
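The general idea can be illustrated with a minimal sketch. It is not the authors' procedure, which the abstract does not specify. It assumes propensity scores have already been estimated, uses hypothetical cutoffs of 0.1 and 0.9 to define the overlap region, and then fits a small greedy classification tree (Gini impurity) so that membership in the study population is stated as rules on the covariates themselves. All function names and the toy data are invented for illustration.

```python
def in_overlap(scores, lo=0.1, hi=0.9):
    """Flag subjects whose propensity score lies in [lo, hi] (assumed cutoffs)."""
    return [lo <= e <= hi for e in scores]

def gini(labels):
    """Gini impurity of a list of booleans."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)

def best_split(X, y):
    """Return (feature index, threshold) that most reduces Gini impurity, or None."""
    n = len(y)
    best, best_imp = None, gini(y)
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        for a, b in zip(values, values[1:]):
            t = (a + b) / 2  # candidate threshold midway between observed values
            left = [y[i] for i in range(n) if X[i][j] <= t]
            right = [y[i] for i in range(n) if X[i][j] > t]
            imp = (len(left) * gini(left) + len(right) * gini(right)) / n
            if imp < best_imp - 1e-12:
                best, best_imp = (j, t), imp
    return best

def fit_tree(X, y, depth=2):
    """Fit a tree: a leaf is a bool, an internal node is (feature, threshold, left, right)."""
    majority = sum(y) * 2 >= len(y)
    if depth == 0 or len(set(y)) <= 1:
        return majority
    split = best_split(X, y)
    if split is None:
        return majority
    j, t = split
    left = [i for i in range(len(y)) if X[i][j] <= t]
    right = [i for i in range(len(y)) if X[i][j] > t]
    return (j, t,
            fit_tree([X[i] for i in left], [y[i] for i in left], depth - 1),
            fit_tree([X[i] for i in right], [y[i] for i in right], depth - 1))

def predict(tree, row):
    """Walk the tree until a leaf (bool) is reached."""
    while isinstance(tree, tuple):
        j, t, left, right = tree
        tree = left if row[j] <= t else right
    return tree

# Toy data: one covariate and hypothetical, already-estimated propensity scores.
X = [[v] for v in range(10)]
scores = [0.05, 0.08, 0.2, 0.3, 0.5, 0.6, 0.7, 0.85, 0.93, 0.97]

included = in_overlap(scores)
tree = fit_tree(X, included, depth=2)
tree_included = [predict(tree, row) for row in X]
```

In this toy example the fitted tree describes the study population as subjects whose covariate lies between two thresholds, a rule that can be reported without reference to the propensity model. In real data a shallow tree will generally only approximate the propensity-score-defined population, so the agreement between the two should be checked.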
