Avoiding pitfalls in applying prediction models, as illustrated by the example of prostate cancer diagnosis.

BACKGROUND Mathematical models are increasingly used to support medical decisions, but applying them in practice brings growing uncertainty. Using prostate cancer (PCa) risk models as an example, we recommend requirements for model development and point out possible pitfalls, so that these models are not used uncritically.

CONTENT We searched MEDLINE for applications of multivariate models that support the prediction of PCa risk. We critically reviewed the methodological aspects of model development and the biological and analytical variability of the parameters used to build the models. We also reviewed the role of prostate biopsy as the gold standard for confirming diagnoses, and we analyzed different methods of model evaluation with respect to their application to different populations. Before a model is used in clinical practice, its results must be validated in a population drawn from the intended field of application. A model's output should be evaluated using typical model characteristics (such as discrimination performance and calibration) together with methods for assessing the risk of a decision. The choice of a model should be based on these results and on the practicality of its use.

SUMMARY Avoiding errors in applying prediction models (for the risk of PCa, for example) requires examining the possible pitfalls of the underlying mathematical models in the context of the individual case. The main tools for this purpose are discrimination, calibration, and decision curve analysis.
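The three evaluation tools named in the summary can be illustrated with a minimal Python sketch. It is not taken from the article: the function names and the toy data are hypothetical, calibration is reduced to its simplest form (mean predicted risk minus observed event rate), and net benefit uses the usual decision curve definition, true positives per patient minus false positives per patient weighted by the odds of the threshold probability.

```python
def auc(y_true, y_prob):
    """Discrimination: chance that a random case gets a higher predicted
    risk than a random non-case (ties count as one half)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def calibration_in_the_large(y_true, y_prob):
    """Calibration (simplest form): mean predicted risk minus observed
    event rate. Zero is ideal; positive values mean overprediction."""
    n = len(y_true)
    return sum(y_prob) / n - sum(y_true) / n


def net_benefit(y_true, y_prob, threshold):
    """Decision curve analysis: net benefit of acting (e.g. performing a
    biopsy) on every patient whose predicted risk is >= threshold."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Reference strategy: act on every patient regardless of the model."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Hypothetical toy data: biopsy outcome (1 = cancer) and predicted risk.
outcomes = [1, 1, 1, 0, 0, 0, 0, 1]
risks = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6, 0.1, 0.7]

if __name__ == "__main__":
    print("AUC:", auc(outcomes, risks))
    print("Calibration-in-the-large:", calibration_in_the_large(outcomes, risks))
    print("Net benefit at 0.5:", net_benefit(outcomes, risks, 0.5))
    print("Treat-all net benefit at 0.5:", net_benefit_treat_all(outcomes, risks, 0.5))
```

In line with the abstract's requirement, such figures are only meaningful when computed on a population from the intended field of application, and a model is worth using at a given threshold only if its net benefit exceeds that of the treat-all and treat-none (net benefit zero) strategies.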
