PROBAST: A Tool to Assess Risk of Bias and Applicability of Prediction Model Studies: Explanation and Elaboration

Prediction models in health care aim to predict for an individual whether a particular outcome, such as disease, is present (diagnostic models) or whether it will occur in the future (prognostic models) (1-6). Diagnostic models can be used to refer patients for further testing, to initiate treatment, or to inform patients. Prognostic models can be used to aid decisions about preventive lifestyle changes, therapeutic interventions, or monitoring strategies, or to stratify risk in randomized trial design and analysis (7, 8). Potential users of prediction models include health care professionals, policymakers, guideline developers, patients, and the general public. The medical literature contains thousands of studies developing and validating prediction models, often with numerous models for the same target population and outcome. For example, more than 60 models address breast cancer prognosis (9), more than 250 exist in obstetrics (10), and nearly 800 predict outcomes in patients with cardiovascular disease (11). This proliferation of prediction models will increase further with the growth of personalized or precision medicine.

Systematic reviews are considered the most reliable form of evidence when addressing randomized therapeutic studies and studies of diagnostic test accuracy (12). In the era of personalized and precision medicine, interest in systematic reviews of prediction model studies is rapidly growing, as exemplified by the formation of the Cochrane Prognosis Methods Group to support systematic reviews of prognosis, including prediction model studies (13, 14). Guidance to facilitate systematic reviews of prediction models has been developed (Table 1), including for search strategies (15, 41-43), formulation of the review question (16, 17), data extraction (16), and meta-analysis (17, 22-25, 40, 44, 45).

Table 1. Guidance on Conducting Systematic Reviews of Prediction Model Studies

Assessment of risk of bias (ROB) is an essential step in any systematic review. Shortcomings in study design, conduct, and analysis can result in study estimates being at ROB, that is, at risk of the results being flawed or distorted. When interpreting the results of a systematic review, readers can draw stronger conclusions from a review based on primary studies at low ROB than from one based on studies at high or unclear ROB (46). Identifying the studies most relevant to the settings and populations targeted in the review (based on the applicability of the primary studies to the review question) is also important.

We therefore developed PROBAST (Prediction model Risk Of Bias ASsessment Tool) to address the lack of suitable tools designed specifically to assess ROB and applicability of primary prediction model studies. PROBAST consists of 4 domains containing 20 signaling questions to facilitate ROB assessment (39). Its structure and rating system are similar to those of tools designed to assess ROB in randomized trials (revised Cochrane ROB Tool [ROB 2.0]), diagnostic accuracy studies (QUADAS-2 [Quality Assessment of Diagnostic Accuracy Studies 2]), and systematic reviews (ROBIS) (37, 47, 48). Although PROBAST was designed for use in systematic reviews of prediction model studies, it can also be used as a general tool for critical appraisal of (primary) prediction model studies. Here, we describe the rationale behind the domains and signaling questions, how to use them, and how to reach domain-level and overall judgments about ROB and applicability of primary studies to a review question.
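To give a concrete, purely illustrative sense of how signaling questions feed into domain-level and overall judgments, the short sketch below encodes one commonly described convention (an overall low ROB rating when all domains are at low ROB, and an overall high ROB rating when any domain is at high ROB). It is a simplification for illustration only, not a substitute for the structured reviewer judgment that PROBAST asks for.

```python
# Illustrative sketch only: a mechanical roll-up of signaling-question answers into
# domain and overall risk-of-bias (ROB) ratings. PROBAST itself relies on reviewer
# judgment; the exact rating rules are given in the full guidance at www.probast.org.
from typing import Dict, List

DOMAINS = ["participants", "predictors", "outcome", "analysis"]

def domain_rating(answers: List[str]) -> str:
    """Answers per signaling question: 'Y', 'PY', 'PN', 'N', or 'NI' (no information)."""
    if any(a in ("N", "PN") for a in answers):
        return "high"      # at least one signaling question flags a potential for bias
    if any(a == "NI" for a in answers):
        return "unclear"   # reporting is insufficient to judge
    return "low"

def overall_rating(domain_ratings: Dict[str, str]) -> str:
    ratings = [domain_ratings[d] for d in DOMAINS]
    if any(r == "high" for r in ratings):
        return "high"
    if all(r == "low" for r in ratings):
        return "low"
    return "unclear"

# Hypothetical example: one 'N' answer in the analysis domain makes the study high ROB overall.
study = {
    "participants": domain_rating(["Y", "Y"]),
    "predictors": domain_rating(["Y", "PY", "Y"]),
    "outcome": domain_rating(["Y", "Y", "PY", "Y", "Y", "Y"]),
    "analysis": domain_rating(["Y", "N", "Y", "PY", "Y", "Y", "NI", "Y", "Y"]),
}
print(overall_rating(study))  # -> high
```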
At the Web site (www.probast.org), 5 filled-in examples from across the medical field illustrate these processes. Because this is an area of active research, the tool, examples, and accompanying guidance will be updated when needed, and the latest version of PROBAST should always be downloaded from the Web site.

Focus of PROBAST

PROBAST is designed to assess primary studies that develop, validate, or update (for example, extend) multivariable prediction models for diagnosis or prognosis (Boxes 1 and 2). A multivariable prediction model is defined as any combination or equation of 2 or more predictors (such as age, sex, symptoms, signs, disease stage, or biomarkers) for estimating for an individual the probability or risk of having (diagnosis) or developing (prognosis) a particular outcome (1, 4, 6-8, 49, 50). Other names for a prediction model include risk prediction model, predictive model, prediction index or rule, and risk score (1, 3-8, 49-51).

Box 1. Types of diagnostic and prognostic modeling studies or reports addressed by PROBAST. Adapted from the TRIPOD (Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis) and CHARMS (CHecklist for critical Appraisal and data extraction for systematic Reviews of prediction Modelling Studies) guidance (8, 16).

Box 2. Differences between diagnostic and prognostic prediction model studies. PROBAST = Prediction model Risk Of Bias ASsessment Tool.

Diagnostic and Prognostic Models

Diagnostic prediction models estimate the probability that a certain outcome, the target condition, is currently present. Diagnostic prediction model studies typically include individuals who are suspected, but not yet known, to have the target condition. Prognostic prediction models estimate the probability that a future outcome or event will occur, such as death, disease recurrence, disease complication, or therapy response. The time period of prediction can vary from hours (for example, preoperatively predicting postoperative nausea and vomiting) to years (for example, predicting lifelong risk for a coronary event). Although many prognostic models enroll patients with an established diagnosis, this does not have to be the starting point, as seen in models for predicting the development of diabetes in pregnant women (52) or of osteoporotic fractures in the general population (53). Consistent with the TRIPOD statement (7, 8), PROBAST thus broadly defines prognostic models as those predicting a future outcome in persons at risk for that outcome.

Diagnostic and prognostic model studies often use different terms for predictors and outcomes (Box 2). The cancer literature frequently distinguishes between prognostic and predictive models, such that predictive models identify individuals with differential treatment effects (54). These (predictive) models are outside the scope of this article.

Types of Predictors, Outcomes, and Modeling Techniques

PROBAST can be used to assess any type of diagnostic or prognostic prediction model aimed at individualized predictions, regardless of the predictors used, the outcomes being predicted, or the methods used to develop, validate, or update (for example, extend) the model.
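Whatever modeling technique is used, the resulting model can usually be written as an equation that combines the values of an individual's predictors into an estimated probability or risk. As a minimal illustration (not a form prescribed by PROBAST), a logistic regression model with p predictors takes the following shape:

```latex
% Illustrative only: estimated outcome probability for an individual with
% predictor values x_1, ..., x_p from a logistic regression prediction model.
\[
\hat{P}(Y = 1 \mid x_1, \ldots, x_p)
  = \frac{1}{1 + \exp\!\bigl[-(\beta_0 + \beta_1 x_1 + \cdots + \beta_p x_p)\bigr]}
\]
% \beta_0 is the model intercept and \beta_1, ..., \beta_p are the estimated
% predictor weights; time-to-event (survival) models express the analogous risk
% over a specified time horizon, for example via a baseline survival function.
```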
Predictors range from demographic characteristics, medical history, and physical examination results; to imaging results, electrophysiology, blood, urine, or tissue measurements, and disease stages or characteristics; to results from omics and other new biological measurements. Predictors are also called covariates, risk indicators, prognostic factors, determinants, index test results, or independent variables (4, 6-8, 49, 50, 55-57). PROBAST distinguishes between candidate predictors and predictors included in the final model (57). Candidate predictors are variables considered potentially predictive of the outcome presence (diagnosis) or occurrence (prognosis), that is, all those evaluated in the study regardless of whether they are included in the final multivariable model.

PROBAST primarily addresses prediction models for binary and time-to-event outcomes because these are the most common in medicine. However, the tool can also be used to assess models predicting nonbinary outcomes, such as continuous scores (for example, pain scores or cholesterol levels) or categorical outcomes (for example, the Glasgow Coma Scale). Almost all PROBAST signaling questions apply equally to continuous and categorical outcomes; the exceptions are questions addressing the number of outcome events per predictor and certain measures of model performance (such as the c-statistic), which are not relevant to continuous outcomes.

Prediction models usually involve regression modeling techniques, such as logistic regression or survival models. They may also be developed or validated using nonregression techniques, such as neural networks, random forests, or support vector machines. As the use of routine care (and big) data increases, additional modeling techniques are becoming more common, including machine learning and artificial intelligence models. The main differences between studies using regression and other types of prediction modeling lie in the methods of data analysis: nonregression models can have a greater risk of overfitting when data are sparse, and their potential lack of transparency can affect the applicability and usability of the resulting models. In the section on tailoring PROBAST with additional signaling questions, we provide guidance on how PROBAST can be adapted to address other types of outcomes and modeling techniques.

Types of Review Questions

PROBAST can be used to assess different types of systematic review questions. For some review questions, all prediction model studies are relevant (both development and validation studies), whereas for others only validation studies are relevant. Box 3 gives examples of potential review questions, for both prognostic and diagnostic prediction models, for which PROBAST is applicable. CHARMS and Table 2 provide explicit guidance on how to frame a focused question for reviews of prediction model studies (16, 17).

Box 3. Examples of systematic review questions for which PROBAST is suitable.

There are various different questions that systematic reviews of

[1]  Karel G M Moons,et al.  Non-invasive risk scores for prediction of type 2 diabetes (EPIC-InterAct): a validation of existing models. , 2014, The lancet. Diabetes & endocrinology.

[2]  J. Habbema,et al.  Internal validation of predictive models: efficiency of some procedures for logistic regression analysis. , 2001, Journal of clinical epidemiology.

[3]  Richard D Riley,et al.  Meta-analysis of prediction model performance across multiple studies: Which scale helps ensure between-study normality for the C-statistic and calibration measures? , 2017, Statistical methods in medical research.

[4]  L. Hooft,et al.  A guide to systematic review and meta-analysis of prediction model performance , 2017, British Medical Journal.

[5]  J. Glanville,et al.  Searching for Studies , 2008 .

[6]  D G Altman,et al.  What do we mean by validating a prognostic model? , 2000, Statistics in medicine.

[7]  Nandini Dendukuri,et al.  Modeling conditional dependence between diagnostic tests: A multiple latent variable model , 2009, Statistics in medicine.

[8]  Diederick E. Grobbee,et al.  Clinical Epidemiology: Principles, Methods, and Applications for Clinical Research , 2008 .

[9]  Yvonne Vergouwe,et al.  Prognosis and prognostic research: validating a prognostic model , 2009, BMJ : British Medical Journal.

[10]  J. Pearl,et al.  Confounding and Collapsibility in Causal Inference , 1999 .

[11]  S. Hui,et al.  Evaluation of diagnostic tests without gold standards , 1998, Statistical methods in medical research.

[12]  John P A Ioannidis,et al.  Overinterpretation of clinical applicability in molecular diagnostic research. , 2009, Clinical chemistry.

[13]  Hester F. Lingsma,et al.  Effects of Glasgow Outcome Scale misclassification on traumatic brain injury clinical trials. , 2008, Journal of neurotrauma.

[14]  C M Rutter,et al.  A hierarchical regression approach to meta‐analysis of diagnostic test accuracy evaluations , 2001, Statistics in medicine.

[15]  Karel G M Moons,et al.  Aggregating published prediction models with individual participant data: a comparison of different approaches , 2012, Statistics in medicine.

[16]  G. Guyatt,et al.  Risk Prediction Models for Mortality in Ambulatory Patients With Heart Failure: A Systematic Review , 2013, Circulation. Heart failure.

[17]  Ian Roberts,et al.  Systematic review of prognostic models in traumatic brain injury , 2006, BMC Medical Informatics Decis. Mak..

[18]  Yvonne Vergouwe,et al.  Advantages of the nested case-control design in diagnostic research , 2008, BMC Medical Research Methodology.

[19]  Douglas G. Altman,et al.  Adequate sample size for developing prediction models is not simply related to events per variable , 2016, Journal of clinical epidemiology.

[20]  S G Pauker,et al.  Pathology and probabilities: a new approach to interpreting and reporting biopsies. , 1981, The New England journal of medicine.

[21]  Karel G M Moons,et al.  A new framework to enhance the interpretation of external validation studies of clinical prediction models. , 2015, Journal of clinical epidemiology.

[22]  A. Evans,et al.  Translating Clinical Research into Clinical Practice: Impact of Using Prediction Rules To Make Decisions , 2006, Annals of Internal Medicine.

[23]  Yvonne Vergouwe,et al.  Development and validation of a prediction model with missing predictor data: a practical approach. , 2010, Journal of clinical epidemiology.

[24]  Mithat Gönen,et al.  A new concordance measure for risk prediction models in external validation settings , 2016, Statistics in medicine.

[25]  Yvonne Vergouwe,et al.  A calibration hierarchy for risk models was defined: from utopia to empirical data. , 2016, Journal of clinical epidemiology.

[26]  Gary S Collins,et al.  Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): Explanation and Elaboration , 2015, Annals of Internal Medicine.

[27]  Patrik Magnusson,et al.  Risk prediction measures for case-cohort and nested case-control designs: an application to cardiovascular disease , 2012, American Journal of Epidemiology.

[28]  G. Collins,et al.  Developing risk prediction models for type 2 diabetes: a systematic review of methodology and reporting , 2011, BMC medicine.

[29]  D. Ragland,et al.  Dichotomizing Continuous Outcome Variables: Dependence of the Magnitude of Association and Statistical Power on the Cutpoint , 1992, Epidemiology.

[30]  Erik Bischoff,et al.  Multidimensional prognostic indices for use in COPD patient care. A systematic review , 2011, Respiratory research.

[31]  Richard D Riley,et al.  Developing and validating risk prediction models in an individual participant data meta-analysis , 2014, BMC Medical Research Methodology.

[32]  A. Laupacis,et al.  Clinical prediction rules. A review and suggested modifications of methodological standards. , 1997, JAMA.

[33]  Ewout W. Steyerberg,et al.  Prognostic Models With Competing Risks: Methods and Application to Coronary Risk Prediction , 2009, Epidemiology.

[34]  Richard D Riley,et al.  Explicit inclusion of treatment in prognostic modeling was recommended in observational and randomized settings. , 2016, Journal of clinical epidemiology.

[35]  Johannes B Reitsma,et al.  Evidence of bias and variation in diagnostic accuracy studies , 2006, Canadian Medical Association Journal.

[36]  M. Leeflang,et al.  Bias in sensitivity and specificity caused by data-driven selection of optimal cutoff values: mechanisms, magnitude, and solutions. , 2008, Clinical chemistry.

[37]  Karel G M Moons,et al.  Imputation of systematically missing predictors in an individual participant data meta‐analysis: a generalized approach using MICE , 2015, Statistics in medicine.

[38]  Patrick Royston,et al.  Reporting methods in studies developing prognostic models in cancer: a review , 2010, BMC medicine.

[39]  Stephen B. Gruber,et al.  Clinical Epidemiology: The Architecture of Clinical Research , 1986, The Yale Journal of Biology and Medicine.

[40]  Bas van Zaane,et al.  Comparison of approaches to estimate confidence intervals of post-test probabilities of diagnostic test results in a nested case-control study , 2012, BMC Medical Research Methodology.

[41]  R A Greenes,et al.  The influence of uninterpretability on the assessment of diagnostic tests. , 1986, Journal of chronic diseases.

[42]  Richard D Riley,et al.  Prognosis research: toward evidence-based results and a Cochrane methods group. , 2007, Journal of clinical epidemiology.

[43]  Maarten Keijzer,et al.  Development and validation of clinical prediction models: marginal differences between logistic regression, penalized maximum likelihood estimation, and genetic programming. , 2012, Journal of clinical epidemiology.

[44]  Richard D Riley,et al.  External validation of clinical prediction models using big datasets from e-health records or IPD meta-analysis: opportunities and challenges , 2016, BMJ.

[45]  Johannes B. Reitsma,et al.  Use of Expert Panels to Define the Reference Standard in Diagnostic Research: A Systematic Review of Published Methods and Reporting , 2013, PLoS medicine.

[46]  Johannes B Reitsma,et al.  Adjusting for partial verification or workup bias in meta-analyses of diagnostic accuracy studies , 2012, American Journal of Epidemiology.

[47]  K. Covinsky,et al.  Assessing the Generalizability of Prognostic Information , 1999, Annals of Internal Medicine.

[48]  Richard D Riley,et al.  A framework for meta-analysis of prediction model studies with binary and time-to-event outcomes , 2018, Statistical methods in medical research.

[49]  J. Higgins,et al.  Cochrane Handbook for Systematic Reviews of Interventions , 2010, International Coaching Psychology Review.

[50]  Gary S Collins,et al.  Quantifying the impact of different approaches for handling continuous predictors on the performance of a prognostic model , 2016, Statistics in medicine.

[51]  K. Bhaskaran,et al.  Data Resource Profile: Clinical Practice Research Datalink (CPRD) , 2015, International journal of epidemiology.

[52]  G. Bedogni,et al.  Clinical Prediction Models—a Practical Approach to Development, Validation and Updating , 2009 .

[53]  S Van Huffel,et al.  Does ignoring clustering in multicenter data influence the performance of prediction models? A simulation study , 2018, Statistical methods in medical research.

[54]  G W Sun,et al.  Inappropriate use of bivariable analysis to screen risk factors for use in multivariable analysis. , 1996, Journal of clinical epidemiology.

[55]  H. Boshuizen,et al.  Multiple imputation of missing blood pressure covariates in survival analysis. , 1999, Statistics in medicine.

[56]  Bruce K Armstrong,et al.  Risk prediction models for incident primary cutaneous melanoma: a systematic review. , 2014, JAMA dermatology.

[57]  Allen F. Shaughnessy,et al.  Clinical Epidemiology: A Basic Science for Clinical Medicine , 2007, BMJ : British Medical Journal.

[58]  J. Sterne,et al.  The Cochrane Collaboration’s tool for assessing risk of bias in randomised trials , 2011, BMJ : British Medical Journal.

[59]  Yemisi Takwoingi,et al.  Empirical Evidence of the Importance of Comparative Studies of Diagnostic Test Accuracy , 2013, Annals of Internal Medicine.

[60]  J. Habbema,et al.  Prognostic Modeling with Logistic Regression Analysis , 2001, Medical decision making : an international journal of the Society for Medical Decision Making.

[61]  Charles E McCulloch,et al.  Relaxing the rule of ten events per variable in logistic and Cox regression. , 2007, American journal of epidemiology.

[62]  Patrick Royston,et al.  The cost of dichotomising continuous variables , 2006, BMJ : British Medical Journal.

[63]  Karel G M Moons,et al.  Meta‐analysis and aggregation of multiple published prediction models , 2014, Statistics in medicine.

[64]  et al.,et al.  Framework for the impact analysis and implementation of Clinical Prediction Rules (CPRs) , 2011, BMC Medical Informatics Decis. Mak..

[65]  Gert Kwakkel,et al.  Early Prediction of Outcome of Activities of Daily Living After Stroke: A Systematic Review , 2011, Stroke.

[66]  S. Peters,et al.  Improvements in risk stratification for the occurrence of cardiovascular disease by imaging subclinical atherosclerosis: a systematic review , 2011, Heart.

[67]  H. Hemingway Prognosis research: why is Dr. Lydgate still waiting? , 2006, Journal of clinical epidemiology.

[68]  M. Kenward,et al.  Multiple imputation for missing data in epidemiological and clinical research: potential and pitfalls , 2009, BMJ : British Medical Journal.

[69]  E. Steyerberg,et al.  Reporting and Methods in Clinical Prediction Research: A Systematic Review , 2012, PLoS medicine.

[70]  Karel Moons,et al.  The Wells Rule Does Not Adequately Rule Out Deep Venous Thrombosis in Primary Care Patients , 2005, Annals of Internal Medicine.

[71]  John P A Ioannidis,et al.  Comparisons of established risk prediction models for cardiovascular disease: systematic review , 2012, BMJ : British Medical Journal.

[72]  E. Steyerberg,et al.  Prognosis Research Strategy (PROGRESS) 3: Prognostic Model Research , 2013, PLoS medicine.

[73]  K J M Janssen,et al.  Multiple imputation to correct for partial verification bias revisited , 2008, Statistics in medicine.

[74]  H. Sox,et al.  Clinical prediction rules. Applications and methodological standards. , 1985, The New England journal of medicine.

[75]  Patrick Royston,et al.  Multiple imputation using chained equations: Issues and guidance for practice , 2011, Statistics in medicine.

[76]  Philip A. Whiting,et al.  9 Assessing Methodological Quality , 2009 .

[77]  Thomas Agoritsas,et al.  Performance of logistic regression modeling: beyond the number of events per variable, the role of data structure. , 2011, Journal of clinical epidemiology.

[78]  Gary S Collins,et al.  Prognostic models in obstetrics: available, but far from applicable. , 2016, American journal of obstetrics and gynecology.

[79]  Isabelle Boutron,et al.  A revised tool for assessing risk of bias in randomized trials , 2016 .

[80]  B. McNeil,et al.  Assessment of radiologic tests: control of bias and other design considerations. , 1988, Radiology.

[81]  T. Therneau,et al.  Assessing calibration of prognostic risk scores , 2016, Statistical methods in medical research.

[82]  E. Steyerberg,et al.  Prognosis Research Strategy (PROGRESS) 2: Prognostic Factor Research , 2013, PLoS medicine.

[83]  G. Guyatt,et al.  Users' Guides to the Medical Literature: III. How to Use an Article About a Diagnostic Test: A. Are the Results of the Study Valid? , 1994, JAMA.

[84]  M. Dennis,et al.  Systematic Review of Prognostic Models in Patients with Acute Stroke , 2001, Cerebrovascular Diseases.

[85]  Richard D Riley,et al.  Minimum sample size for developing a multivariable prediction model: PART II ‐ binary and time‐to‐event outcomes , 2018, Statistics in medicine.

[86]  G. Collins,et al.  Critical Appraisal and Data Extraction for Systematic Reviews of Prediction Modelling Studies: The CHARMS Checklist , 2014, PLoS medicine.

[87]  Loes C M Bertens,et al.  Value of composite reference standards in diagnostic research , 2013, BMJ.

[88]  Rolf H H Groenwold,et al.  Performance of the original EuroSCORE. , 2012, European journal of cardio-thoracic surgery : official journal of the European Association for Cardio-thoracic Surgery.

[89]  G. Collins,et al.  PROBAST: A Tool to Assess the Risk of Bias and Applicability of Prediction Model Studies , 2019, Annals of Internal Medicine.

[90]  Diederick E Grobbee,et al.  When should we remain blind and when should our eyes remain open in diagnostic studies? , 2002, Journal of clinical epidemiology.

[91]  C B Begg,et al.  Biases in the assessment of diagnostic tests. , 1987, Statistics in medicine.

[92]  Patrick Royston,et al.  Multivariable Model-Building: A Pragmatic Approach to Regression Analysis based on Fractional Polynomials for Modelling Continuous Variables , 2008 .

[93]  D. van der A,et al.  The validation of cardiovascular risk scores for patients with type 2 diabetes mellitus , 2014, Heart.

[94]  T. Stijnen,et al.  Review: a gentle introduction to imputation of missing values. , 2006, Journal of clinical epidemiology.

[95]  C. Linkletter,et al.  Development of a cardiovascular risk score for use in low- and middle-income countries. , 2011, The Journal of nutrition.

[96]  Sunil J Rao,et al.  Regression Modeling Strategies: With Applications to Linear Models, Logistic Regression, and Survival Analysis , 2003 .

[97]  M. Hernán,et al.  ROBINS-I: a tool for assessing risk of bias in non-randomised studies of interventions , 2016, British Medical Journal.

[98]  Douglas G. Altman,et al.  No rationale for 1 variable per 10 events criterion for binary logistic regression analysis , 2016, BMC Medical Research Methodology.

[99]  A. Feinstein,et al.  Problems of spectrum and bias in evaluating the efficacy of diagnostic tests. , 1978, The New England journal of medicine.

[100]  A. Marmarou,et al.  Impact of GOS misclassification on ordinal outcome analysis of traumatic brain injury clinical trials. , 2012, Journal of neurotrauma.

[101]  A. Hussain,et al.  Predicting length of stay in thermal burns: a systematic review of prognostic factors. , 2013, Burns : journal of the International Society for Burn Injuries.

[102]  Fiona Lecky,et al.  Predicting early death in patients with traumatic bleeding: development and validation of prognostic model , 2012, BMJ : British Medical Journal.

[103]  Constantine Gatsonis,et al.  Analysing and Presenting Results , 2010 .

[104]  C. Bombardier,et al.  Assessing Bias in Studies of Prognostic Factors , 2013, Annals of Internal Medicine.

[105]  Richard D Riley,et al.  Performance of methods for meta-analysis of diagnostic test accuracy with few studies or sparse data , 2015, Statistical methods in medical research.

[106]  Yvonne Vergouwe,et al.  Adaptation of Clinical Prediction Models for Application in Local Settings , 2012, Medical decision making : an international journal of the Society for Medical Decision Making.

[107]  F. Harrell,et al.  Prognostic/Clinical Prediction Models: Multivariable Prognostic Models: Issues in Developing Models, Evaluating Assumptions and Adequacy, and Measuring and Reducing Errors , 2005 .

[108]  M. Hlatky Evaluation of diagnostic tests. , 1986, Journal of chronic diseases.

[109]  Haitao Chu,et al.  A unification of models for meta-analysis of diagnostic accuracy studies. , 2009, Biostatistics.

[110]  M. Gail,et al.  Biased estimates of treatment effect in randomized experiments with nonlinear regressions and omitted covariates , 1984 .

[111]  P. Austin,et al.  Events per variable (EPV) and the relative performance of different strategies for estimating the out-of-sample validity of logistic regression models , 2014, Statistical methods in medical research.

[112]  J. Schafer Multiple imputation: a primer , 1999, Statistical methods in medical research.

[113]  Olli Saarela,et al.  Case-cohort design in practice – experiences from the MORGAM Project , 2007, Epidemiologic Perspectives & Innovations.

[114]  Ralph B. D'Agostino,et al.  Evaluation of the Performance of Survival Analysis Models: Discrimination and Calibration Measures , 2003, Advances in Survival Analysis.

[115]  R A Greenes,et al.  Assessment of diagnostic tests when disease verification is subject to selection bias. , 1983, Biometrics.

[116]  A R Feinstein,et al.  The impact of clinical history on mammographic interpretations. , 1997, JAMA.

[117]  Ewout W Steyerberg,et al.  Modern modelling techniques are data hungry: a simulation study for predicting dichotomous endpoints , 2014, BMC Medical Research Methodology.

[118]  J. Knottnerus,et al.  Assessment of the accuracy of diagnostic tests: the cross-sectional study. , 2003, Journal of clinical epidemiology.

[119]  Patrick M M Bossuyt,et al.  Prediction models in reproductive medicine: a critical appraisal. , 2009, Human reproduction update.

[120]  Lucas M. Bachmann,et al.  Clinical Value of Prognostic Instruments to Identify Patients with an Increased Risk for Osteoporotic Fractures: Systematic Review , 2011, PloS one.

[121]  Richard D Riley,et al.  Unexpected predictor–outcome associations in clinical prediction research: causes and solutions , 2013, Canadian Medical Association Journal.

[122]  Ewout W Steyerberg,et al.  Internal and external validation of predictive models: a simulation study of bias and precision in small samples. , 2003, Journal of clinical epidemiology.

[123]  Richard D Riley,et al.  Multilevel mixed effects parametric survival models using adaptive Gauss–Hermite quadrature with application to recurrent events and individual participant data meta‐analysis , 2014, Statistics in medicine.

[124]  M. Woodward,et al.  Risk prediction models: II. External validation, model updating, and impact assessment , 2012, Heart.

[125]  Richard D Riley,et al.  Multivariate meta-analysis of individual participant data helped externally validate the performance and implementation of a prediction model , 2016, Journal of clinical epidemiology.

[126]  Qingxia Chen,et al.  Dealing with missing predictor values when applying clinical prediction models. , 2009, Clinical chemistry.

[127]  Yvonne Vergouwe,et al.  External validity of risk models: Use of benchmark values to disentangle a case-mix effect from incorrect coefficients. , 2010, American journal of epidemiology.

[128]  Henry S. Sacks Book ReviewClinical Epidemiology: The architecture of clinical research , 1986 .

[129]  Ewout W Steyerberg,et al.  Extensions of net reclassification improvement calculations to measure usefulness of new biomarkers , 2011, Statistics in medicine.

[130]  M. P. Koster,et al.  External validation of prognostic models to predict risk of gestational diabetes mellitus in one Dutch cohort: prospective multicentre cohort study , 2016, British Medical Journal.

[131]  M. Egger,et al.  The hazards of scoring the quality of clinical trials for meta-analysis. , 1999, JAMA.

[132]  D. Moher,et al.  Preferred reporting items for systematic reviews and meta-analyses: the PRISMA statement. , 2010, International journal of surgery.

[133]  N. Obuchowski,et al.  Assessing the Performance of Prediction Models: A Framework for Traditional and Novel Measures , 2010, Epidemiology.

[134]  David Moher,et al.  Preferred Reporting Items for a Systematic Review and Meta-analysis of Diagnostic Test Accuracy Studies: The PRISMA-DTA Statement , 2018, JAMA.

[135]  W. Sauerbrei,et al.  Dangers of using "optimal" cutpoints in the evaluation of prognostic factors. , 1994, Journal of the National Cancer Institute.

[136]  Richard D Riley,et al.  Prognosis research strategy (PROGRESS) 4: Stratified medicine research , 2013, BMJ : British Medical Journal.

[137]  Yvonne Vergouwe,et al.  Prognosis and prognostic research: Developing a prognostic model , 2009, BMJ : British Medical Journal.

[138]  P. Royston,et al.  Prognosis and prognostic research: application and impact of prognostic models in clinical practice , 2009, BMJ : British Medical Journal.

[139]  H C Sox,et al.  Probability theory in the use of diagnostic tests. An introduction to critical study of the literature. , 1986, Annals of internal medicine.

[140]  Haitao Chu,et al.  Bivariate Random Effects Meta-Analysis of Diagnostic Studies Using Generalized Linear Mixed Models , 2010, Medical decision making : an international journal of the Society for Medical Decision Making.

[141]  Richard Simon,et al.  Bias in error estimation when using cross-validation for model selection , 2006, BMC Bioinformatics.

[142]  Yvonne Vergouwe,et al.  Substantial effective sample sizes were required for external validation studies of predictive logistic regression models. , 2005, Journal of clinical epidemiology.

[143]  G. Collins,et al.  Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): The TRIPOD Statement , 2015, Annals of Internal Medicine.

[144]  N. King,et al.  External validation of the CRASH and IMPACT prognostic models in severe traumatic brain injury. , 2014, Journal of Neurotrauma.

[145]  S D Walter,et al.  Estimation of test sensitivity and specificity when disease confirmation is limited to positive results. , 1999, Epidemiology.

[146]  Douglas G Altman,et al.  Combining estimates of interest in prognostic modelling studies after multiple imputation: current practice and guidelines , 2009, BMC medical research methodology.

[147]  Richard D Riley,et al.  Prognosis research strategy (PROGRESS) 1: A framework for researching clinical outcomes , 2013, BMJ : British Medical Journal.

[148]  D. G. Altman,et al.  Statistical aspects of prognostic factor studies in oncology. , 1994, British Journal of Cancer.

[149]  J. Reitsma,et al.  Latent class models in diagnostic studies when there is no reference standard--a systematic review. , 2014, American journal of epidemiology.

[150]  Rachel Churchill,et al.  ROBIS: A new tool to assess risk of bias in systematic reviews was developed , 2016, Journal of clinical epidemiology.

[151]  J. Ioannidis,et al.  Assessment of claims of improved prediction beyond the Framingham risk score. , 2009, JAMA.

[152]  E. Elkin,et al.  Decision Curve Analysis: A Novel Method for Evaluating Prediction Models , 2006, Medical decision making : an international journal of the Society for Medical Decision Making.

[153]  L. Tamariz,et al.  Usefulness of clinical prediction rules for the diagnosis of venous thromboembolism: a systematic review. , 2004, The American journal of medicine.

[154]  Karel G M Moons,et al.  Missing covariate data in clinical research: when and when not to use the missing-indicator method for analysis , 2012, Canadian Medical Association Journal.

[155]  Victor M Montori,et al.  Synthesizing evidence: shifting the focus from individual studies to the body of evidence. , 2013, JAMA.

[156]  John P. A. Ioannidis,et al.  An empirical assessment of validation practices for molecular classifiers , 2011, Briefings Bioinform..

[157]  Theo Stijnen,et al.  Using the outcome for imputation of missing predictor values was preferred. , 2006, Journal of clinical epidemiology.

[158]  J. Ioannidis,et al.  The PRISMA statement for reporting systematic reviews and meta-analyses of studies that evaluate health care interventions: explanation and elaboration. , 2009, Journal of clinical epidemiology.

[159]  J. Concato,et al.  Importance of events per independent variable in proportional hazards regression analysis. II. Accuracy and precision of regression estimates. , 1995, Journal of clinical epidemiology.

[160]  Lena Osterhagen,et al.  Multiple Imputation For Nonresponse In Surveys , 2016 .

[161]  A. Abu-Hanna,et al.  Prediction of Mortality in Very Premature Infants: A Systematic Review of Prediction Models , 2011, PloS one.

[162]  B. J. Ingui,et al.  Searching for clinical prediction rules in MEDLINE. , 2001, Journal of the American Medical Informatics Association : JAMIA.

[163]  G. Collins,et al.  External validation of multivariable prediction models: a systematic review of methodological conduct and reporting , 2014, BMC Medical Research Methodology.

[164]  G. Collins,et al.  PROBAST: A Tool to Assess the Risk of Bias and Applicability of Prediction Model Studies. , 2019, Annals of internal medicine.

[165]  Gary S Collins,et al.  Sample size considerations for the external validation of a multivariable prognostic model: a resampling study , 2015, Statistics in medicine.

[166]  D. Sackett,et al.  The Ends of Human Life: Medical Ethics in a Liberal Polity , 1992, Annals of Internal Medicine.

[167]  Johannes B Reitsma,et al.  Case-control and two-gate designs in diagnostic accuracy studies. , 2005, Clinical chemistry.

[168]  M. Leeflang,et al.  Search Filters for Finding Prognostic and Diagnostic Prediction Studies in Medline to Enhance Systematic Reviews , 2012, PloS one.

[169]  Maarten van Smeden,et al.  Evaluating Diagnostic Accuracy in the Face of Multiple Reference Standards , 2013, Annals of Internal Medicine.

[170]  Yvonne Vergouwe,et al.  Prognosis and prognostic research: what, why, and how? , 2009, BMJ : British Medical Journal.

[171]  R. Perera,et al.  Diagnostic accuracy studies: how to report and analyse inconclusive test results , 2013, BMJ.

[172]  P. Bossuyt,et al.  Evaluation of diagnostic tests when there is no gold standard. A review of methods. , 2007, Health technology assessment.

[173]  Nicole A. Lazar,et al.  Statistical Analysis With Missing Data , 2003, Technometrics.

[174]  J A Swets,et al.  Measuring the accuracy of diagnostic systems. , 1988, Science.

[175]  Thomas Ulahannan The Evidence Base of Clinical Diagnosis , 2002 .

[176]  M S Pepe,et al.  Using a combination of reference tests to assess the accuracy of a new diagnostic test. , 1999, Statistics in medicine.

[177]  Penny Whiting,et al.  No role for quality scores in systematic reviews of diagnostic accuracy studies , 2005, BMC Medical Research Methodology.

[178]  Daniel B. Mark,et al.  Tutorial in biostatistics: multivariable prognostic models: issues in developing models, evaluating assumptions and adequacy, and measuring and reducing errors , 1996, Statistics in medicine.

[179]  D. Moher,et al.  Preferred reporting items for systematic reviews and meta-analyses: the PRISMA statement , 2009, BMJ.

[180]  Gareth Ambler,et al.  How to develop a more accurate risk prediction model when there are few events , 2015, BMJ : British Medical Journal.

[181]  Ørnulf Borgan,et al.  A method for checking regression models in survival analysis based on the risk score , 1996, Lifetime data analysis.

[182]  Richard D. Riley,et al.  A systematic review of breast cancer incidence risk prediction models with meta-analysis of their performance , 2012, Breast Cancer Research and Treatment.

[183]  Richard D Riley,et al.  Minimum sample size for developing a multivariable prediction model: Part I – Continuous outcomes , 2018, Statistics in medicine.

[184]  Ben Ewald,et al.  Post hoc choice of cut points introduced bias to diagnostic research. , 2006, Journal of clinical epidemiology.

[185]  Kristopher J Preacher,et al.  On the practice of dichotomization of quantitative variables. , 2002, Psychological methods.

[186]  Qingxia Chen,et al.  Missing covariate data in medical research: to impute is better than to ignore. , 2010, Journal of clinical epidemiology.

[187]  P. Bossuyt,et al.  Empirical evidence of design-related bias in studies of diagnostic tests. , 1999, JAMA.

[188]  Douglas G Altman,et al.  Dichotomizing continuous predictors in multiple regression: a bad idea , 2006, Statistics in medicine.

[189]  J. Habbema,et al.  Prognostic modelling with logistic regression analysis: a comparison of selection and estimation methods in small data sets. , 2000, Statistics in medicine.

[190]  Joris A H de Groot,et al.  Verification problems in diagnostic accuracy studies: consequences and solutions , 2011, BMJ : British Medical Journal.

[191]  Johannes B Reitsma,et al.  Bivariate analysis of sensitivity and specificity produces informative summary measures in diagnostic reviews. , 2005, Journal of clinical epidemiology.

[192]  A. Dixon,et al.  Measuring the effects of imaging: an evaluative framework. , 1995, Clinical radiology.

[193]  Tom Fahey,et al.  Optimized retrieval of primary care clinical prediction rules from MEDLINE to establish a Web-based register. , 2011, Journal of clinical epidemiology.

[194]  Johannes B. Reitsma,et al.  A review of solutions for diagnostic accuracy studies with an imperfect or missing reference standard. , 2009, Journal of clinical epidemiology.

[195]  Ofer Harel,et al.  Multiple imputation for correcting verification bias , 2006, Statistics in medicine.

[196]  Susan Mallett,et al.  A systematic review classifies sources of bias and variation in diagnostic test accuracy studies. , 2013, Journal of clinical epidemiology.

[197]  F. Harrell,et al.  Regression models in clinical studies: determining relationships between predictors and response. , 1988, Journal of the National Cancer Institute.

[198]  Carol Coupland,et al.  Derivation and validation of updated QFracture algorithm to predict risk of osteoporotic fracture in primary care in the United Kingdom: prospective open cohort study , 2012, BMJ : British Medical Journal.

[199]  Susan Mallett,et al.  Circulating MicroRNAs as a Novel Class of Diagnostic Biomarkers in Gastrointestinal Tumors Detection: A Meta-Analysis Based on 42 Articles , 2014, PloS one.

[200]  J. Concato,et al.  A simulation study of the number of events per variable in logistic regression analysis. , 1996, Journal of clinical epidemiology.

[201]  M. Woodward,et al.  Risk prediction models: I. Development, internal validation, and assessing the incremental value of a new (bio)marker , 2012, Heart.

[202]  J. Bartlett Predicting bacterial cause in infectious conjunctivitis: Cohort study on informativeness of combinations of signs and symptoms , 2004 .

[203]  Ewout W. Steyerberg,et al.  Graphical assessment of internal and external calibration of logistic regression models by using loess smoothers , 2013, Statistics in medicine.

[204]  Jean Sanderson,et al.  Derivation and assessment of risk prediction models using case-cohort data , 2013, BMC Medical Research Methodology.

[205]  R. Brian Haynes,et al.  Developing Optimal Search Strategies for Detecting Sound Clinical Prediction Studies in MEDLINE , 2003, AMIA.

[206]  Douglas G Altman,et al.  Prognostic Models: A Methodological Framework and Review of Models for Breast Cancer , 2009, Cancer investigation.

[207]  Douglas G Altman,et al.  Comparison of techniques for handling missing covariate data within prognostic modelling studies: a simulation study , 2010, BMC medical research methodology.

[208]  E. Mohammadi,et al.  Barriers and facilitators related to the implementation of a physiological track and trigger system: A systematic review of the qualitative evidence , 2017, International journal for quality in health care : journal of the International Society for Quality in Health Care.