On resampling methods for model assessment in penalized and unpenalized logistic regression

Penalized logistic regression methods are frequently used to investigate the relationship between a binary outcome and a set of explanatory variables. Model performance can be assessed by measures such as the concordance statistic (c-statistic), the discrimination slope and the Brier score. Data resampling techniques such as cross-validation are often employed to correct for optimism in these performance measures. Leave-one-out cross-validation is a popular choice, especially with small samples or a rare binary outcome. Using simulations and a real data example, we compared the effect of different resampling techniques on the estimation of c-statistics, discrimination slopes and Brier scores for three estimators of logistic regression models: the maximum likelihood estimator and two maximum penalized likelihood estimators. Our simulation study confirms earlier studies reporting that leave-one-out cross-validated c-statistics can be strongly biased towards zero. In addition, our study reveals that this bias is more pronounced for estimators that shrink predicted probabilities towards the observed event rate, such as ridge regression. Leave-one-out cross-validation also provided pessimistic estimates of the discrimination slope but nearly unbiased estimates of the Brier score. To estimate c-statistics, we recommend leave-pair-out cross-validation, five-fold cross-validation with repetition, the enhanced bootstrap or the .632+ bootstrap. To estimate discrimination slopes, we recommend leave-pair-out or five-fold cross-validation.
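The three performance measures named above can be illustrated with a minimal sketch in plain Python. It is not the authors' code: the function names and the toy data are assumptions made for illustration. The last helper shows one simple mechanism behind the downward bias of pooled leave-one-out c-statistics, using an intercept-only model whose prediction is just the event rate of the training data. Leaving out an event lowers that rate and leaving out a non-event raises it, so every event receives a lower prediction than every non-event.

```python
from itertools import product


def c_statistic(y, p):
    """Share of event/non-event pairs ranked correctly; ties count as 1/2."""
    events = [pi for yi, pi in zip(y, p) if yi == 1]
    nonevents = [pi for yi, pi in zip(y, p) if yi == 0]
    if not events or not nonevents:
        raise ValueError("need at least one event and one non-event")
    score = 0.0
    for pe, pn in product(events, nonevents):
        if pe > pn:
            score += 1.0
        elif pe == pn:
            score += 0.5
    return score / (len(events) * len(nonevents))


def discrimination_slope(y, p):
    """Mean predicted probability in events minus that in non-events."""
    events = [pi for yi, pi in zip(y, p) if yi == 1]
    nonevents = [pi for yi, pi in zip(y, p) if yi == 0]
    return sum(events) / len(events) - sum(nonevents) / len(nonevents)


def brier_score(y, p):
    """Mean squared difference between predicted probability and outcome."""
    return sum((pi - yi) ** 2 for yi, pi in zip(y, p)) / len(y)


def loo_intercept_only(y):
    """Pooled leave-one-out predictions from an intercept-only model.

    Illustrative assumption: the prediction for observation i is the
    event rate among the remaining n - 1 observations.
    """
    n, k = len(y), sum(y)
    return [(k - yi) / (n - 1) for yi in y]


# Hypothetical toy data: outcomes and predicted probabilities.
y = [1, 1, 0, 0, 0]
p = [0.8, 0.4, 0.5, 0.2, 0.1]

print(c_statistic(y, p))           # 5 of 6 pairs are concordant
print(discrimination_slope(y, p))  # 0.6 minus (0.8 / 3)
print(brier_score(y, p))           # 0.70 / 5

# Every event is ranked below every non-event, so the pooled c-statistic is 0.
print(c_statistic(y, loo_intercept_only(y)))
```

The intercept-only case is an extreme one, chosen because it can be checked by hand. It is consistent with the finding that the bias grows for estimators that shrink predictions towards the observed event rate, but it does not reproduce the simulation study.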
