Internal validation of risk models in clustered data: a comparison of bootstrap schemes.

Internal validity of a risk model can be studied efficiently with bootstrapping to assess possible optimism in model performance. Assumptions of the regular bootstrap are violated when the development data are clustered. We compared alternative resampling schemes in clustered data for the estimation of optimism in model performance. A simulation study was conducted to compare regular resampling on only the patient level with resampling on only the cluster level and with resampling sequentially on both the cluster and patient levels (2-step approach). Optimism for the concordance index and calibration slope was estimated. Resampling of only patients or only clusters gave accurate estimates of optimism in model performance. The 2-step approach overestimated the optimism in model performance. When the number of centers or the intraclass correlation coefficient was high, resampling of clusters gave more accurate estimates than resampling of patients. The 3 bootstrap schemes were also applied to empirical data that were clustered. The results presented in this paper support the use of resampling on only the clusters for estimation of optimism in model performance when data are clustered.
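
To make the three resampling schemes concrete, the following is a minimal Python sketch of how one bootstrap sample might be drawn under each scheme. It is an illustration under stated assumptions, not the implementation used in the study: the function names, the representation of the data as a list of cluster labels (one per patient row), and the choice to draw as many clusters as the original data contain are all hypothetical. Each function returns row indices into the original data; model fitting and the optimism calculation itself are left out.

```python
import random
from collections import defaultdict


def group_by_cluster(cluster_ids):
    """Map each cluster label to the list of row indices belonging to it."""
    groups = defaultdict(list)
    for row, cid in enumerate(cluster_ids):
        groups[cid].append(row)
    return dict(groups)


def resample_patients(cluster_ids, rng):
    """Regular bootstrap: draw patients with replacement, ignoring clusters."""
    n = len(cluster_ids)
    return [rng.randrange(n) for _ in range(n)]


def resample_clusters(cluster_ids, rng):
    """Cluster bootstrap: draw whole clusters with replacement and keep
    every patient of each drawn cluster (assumed: as many draws as clusters)."""
    groups = group_by_cluster(cluster_ids)
    labels = list(groups)
    drawn = [rng.choice(labels) for _ in labels]
    return [row for cid in drawn for row in groups[cid]]


def resample_two_step(cluster_ids, rng):
    """2-step bootstrap: draw clusters with replacement, then draw patients
    with replacement within each drawn cluster."""
    groups = group_by_cluster(cluster_ids)
    labels = list(groups)
    drawn = [rng.choice(labels) for _ in labels]
    rows = []
    for cid in drawn:
        members = groups[cid]
        rows.extend(rng.choice(members) for _ in members)
    return rows


if __name__ == "__main__":
    # Hypothetical toy data: 9 patients in 3 centers of unequal size.
    cluster_ids = ["A", "A", "A", "B", "B", "C", "C", "C", "C"]
    rng = random.Random(2024)
    print("patients:", resample_patients(cluster_ids, rng))
    print("clusters:", resample_clusters(cluster_ids, rng))
    print("two-step:", resample_two_step(cluster_ids, rng))
```

Note that with clusters of unequal size, the cluster-level and 2-step samples will generally not have the same number of rows as the original data, whereas the patient-level sample always does. In an optimism estimate, each such sample would be used to refit the model, and the difference between performance in the bootstrap sample and in the original data would be averaged over many replicates.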
