Minimum sample size for developing a multivariable prediction model: PART II ‐ binary and time‐to‐event outcomes

When designing a study to develop a new prediction model with a binary or time‐to‐event outcome, researchers should ensure that the sample size is adequate in terms of the number of participants (n) and outcome events (E) relative to the number of predictor parameters (p) considered for inclusion. We propose that the minimum values of n and E (and subsequently the minimum number of events per predictor parameter, EPP) should be calculated to meet the following three criteria: (i) small optimism in predictor effect estimates, as defined by a global shrinkage factor of ≥ 0.9; (ii) a small absolute difference of ≤ 0.05 between the model's apparent and adjusted Nagelkerke's R²; and (iii) precise estimation of the overall risk in the population. Criteria (i) and (ii) aim to reduce overfitting conditional on a chosen p, and require prespecification of the model's anticipated Cox‐Snell R², which we show can be obtained from previous studies. The values of n and E that meet all three criteria provide the minimum sample size required for model development. Applying our approach, a new diagnostic model for Chagas disease requires an EPP of at least 4.8, whereas a new prognostic model for recurrent venous thromboembolism requires an EPP of at least 23. This contrast reinforces why rules of thumb (eg, 10 EPP) should be avoided. Researchers might additionally ensure that the sample size gives precise estimates of key predictor effects; this is especially important when key categorical predictors have few events in some categories, as this may substantially increase the numbers required.
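The three criteria can be sketched in code for the binary-outcome case. The abstract does not state the formulas, so the expressions below are assumptions: they are the closed-form expressions as commonly reported for this approach, and they should be checked against the full paper before use. The function names, the 1.96 multiplier for a 95% interval, the 0.05 margin for criterion (iii), and the example inputs (p = 10, Cox‐Snell R² = 0.2, outcome proportion 0.5) are all hypothetical choices made for illustration, not values from the paper.

```python
import math


def n_for_shrinkage(p, r2_cs, shrinkage=0.9):
    """Criterion (i): n giving an expected global shrinkage factor >= `shrinkage`.

    Assumed formula: n = p / ((S - 1) * ln(1 - R2_CS / S)).
    """
    return p / ((shrinkage - 1.0) * math.log(1.0 - r2_cs / shrinkage))


def max_r2_cs(phi):
    """Largest attainable Cox-Snell R2 for a binary outcome with proportion phi.

    Assumed formula: 1 - exp(2 * lnL_null / n), where lnL_null / n is the
    per-participant log-likelihood of the intercept-only model.
    """
    ln_l_null_per_obs = phi * math.log(phi) + (1.0 - phi) * math.log(1.0 - phi)
    return 1.0 - math.exp(2.0 * ln_l_null_per_obs)


def n_for_r2_difference(p, r2_cs, phi, delta=0.05):
    """Criterion (ii): apparent minus adjusted Nagelkerke R2 of at most `delta`.

    Assumed approach: convert delta into a required shrinkage factor,
    S = R2_CS / (R2_CS + delta * max R2_CS), then reuse criterion (i).
    """
    s = r2_cs / (r2_cs + delta * max_r2_cs(phi))
    return n_for_shrinkage(p, r2_cs, s)


def n_for_overall_risk(phi, margin=0.05):
    """Criterion (iii): estimate the overall outcome proportion to within
    +/- `margin` (assumed 95% interval, normal approximation)."""
    return (1.96 / margin) ** 2 * phi * (1.0 - phi)


def minimum_sample_size(p, r2_cs, phi):
    """Return (n, E, EPP), where n satisfies all three criteria at once."""
    n = math.ceil(max(
        n_for_shrinkage(p, r2_cs),
        n_for_r2_difference(p, r2_cs, phi),
        n_for_overall_risk(phi),
    ))
    events = n * phi        # expected number of outcome events
    epp = events / p        # events per predictor parameter
    return n, events, epp


if __name__ == "__main__":
    # Hypothetical inputs, chosen only to exercise the functions.
    n, events, epp = minimum_sample_size(p=10, r2_cs=0.2, phi=0.5)
    print(f"n = {n}, E = {events:.1f}, EPP = {epp:.1f}")
```

Taking the maximum across the three criteria reflects the statement that the required n and E must satisfy all of them simultaneously. Time‐to‐event outcomes would need a different treatment of criterion (iii) and of the maximum Cox‐Snell R², which this sketch does not attempt.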
