How to Establish Clinical Prediction Models

A clinical prediction model can be applied to several challenging clinical scenarios: screening high-risk individuals for asymptomatic disease, predicting future events such as disease or death, and assisting medical decision-making and health education. Although clinical prediction models can have a substantial impact on practice, prediction modeling is a complex process that requires careful statistical analysis and sound clinical judgement. There is no definite consensus on the best methodology for model development and validation, but a few recommendations and checklists have been proposed. In this review, we summarize five steps for developing and validating a clinical prediction model: preparation for establishing clinical prediction models; dataset selection; handling variables; model generation; and model evaluation and validation. We also review several studies that detail methods for developing clinical prediction models, with comparable examples from real practice. After model development and rigorous validation in relevant settings, possibly with evaluation of utility/usability and fine-tuning, good models can be ready for use in practice. We anticipate that this framework will revitalize the use of predictive or prognostic research in endocrinology, leading to active applications in real clinical practice.
