Practice of Epidemiology

Variable Selection for Propensity Score Models

Despite the growing popularity of propensity score (PS) methods in epidemiology, relatively little has been written in the epidemiologic literature about the problem of variable selection for PS models. The authors present the results of two simulation studies designed to help epidemiologists gain insight into the variable selection problem in a PS analysis. The simulation studies illustrate how the choice of variables included in a PS model can affect the bias, variance, and mean squared error of an estimated exposure effect. The results suggest that variables that are unrelated to the exposure but related to the outcome should always be included in a PS model. Including these variables decreases the variance of an estimated exposure effect without increasing bias. In contrast, including variables that are related to the exposure but not to the outcome increases the variance of the estimated exposure effect without decreasing bias. In very small studies, including variables that are strongly related to the exposure but only weakly related to the outcome can increase the mean squared error of the estimate: adding these variables removes only a small amount of bias but can increase the variance of the estimated exposure effect. These simulation studies and other analytical results suggest that standard model-building tools designed to create good predictive models of the exposure will not always lead to optimal PS models, particularly in small studies.
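The kind of comparison described above can be illustrated with a small Monte Carlo sketch. This is not the authors' simulation design. It is a simplified illustration under assumptions chosen here for brevity: a continuous outcome, an inverse-probability-weighted (Hajek) estimator, three independent standard normal covariates (a confounder, a variable related only to the outcome, and a variable related only to the exposure), and arbitrary coefficient values. The function names `fit_logistic`, `ipw_effect`, and `simulate` are hypothetical helpers, not part of any library.

```python
import numpy as np


def fit_logistic(X, a, n_iter=25):
    """Fit a logistic regression by Newton-Raphson.

    X must already contain an intercept column; a is a 0/1 exposure vector.
    """
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        w = p * (1.0 - p)
        grad = X.T @ (a - p)             # score vector
        hess = X.T @ (X * w[:, None])    # observed information matrix
        beta = beta + np.linalg.solve(hess, grad)
    return beta


def ipw_effect(y, a, ps):
    """Hajek (normalized) inverse-probability-weighted difference in means."""
    w1 = a / ps
    w0 = (1 - a) / (1 - ps)
    return np.sum(w1 * y) / np.sum(w1) - np.sum(w0 * y) / np.sum(w0)


def simulate(n=500, reps=300, true_effect=1.0, seed=0):
    """Compare PS models that differ only in which covariates they include."""
    rng = np.random.default_rng(seed)
    # Column 0: confounder, column 1: outcome-only, column 2: exposure-only.
    models = {
        "confounder only": [0],
        "+ outcome-only": [0, 1],
        "+ exposure-only": [0, 2],
    }
    estimates = {name: [] for name in models}
    for _ in range(reps):
        x = rng.normal(size=(n, 3))
        # Assumed data-generating model (illustrative coefficients).
        logit = 0.5 * x[:, 0] + 1.5 * x[:, 2]
        a = rng.binomial(1, 1.0 / (1.0 + np.exp(-logit)))
        y = true_effect * a + x[:, 0] + 2.0 * x[:, 1] + rng.normal(size=n)
        for name, cols in models.items():
            design = np.column_stack([np.ones(n), x[:, cols]])
            ps = 1.0 / (1.0 + np.exp(-design @ fit_logistic(design, a)))
            estimates[name].append(ipw_effect(y, a, ps))
    summary = {}
    for name, est in estimates.items():
        est = np.asarray(est)
        summary[name] = {
            "bias": est.mean() - true_effect,
            "variance": est.var(),  # ddof=0, so mse = bias**2 + variance
            "mse": np.mean((est - true_effect) ** 2),
        }
    return summary


if __name__ == "__main__":
    for name, stats in simulate().items():
        print(f"{name:16s} bias={stats['bias']:+.3f} "
              f"var={stats['variance']:.4f} mse={stats['mse']:.4f}")
```

Under these assumed settings, one would expect the model that adds the outcome-only covariate to show the smallest variance and the model that adds the exposure-only covariate to show the largest, in line with the pattern the abstract reports. The exact numbers depend on the chosen coefficients, sample size, and estimator.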
