Multivariate Classification Rules: Calibration and Discrimination

This article presents measures of discrimination and calibration, along with graphical displays for assessing each. It emphasizes multivariate classification rules that assign a subject to one of two possible states, and also discusses extensions to multistate classifications. The c-index and the Hosmer–Lemeshow χ² statistic are the most widely used measures of discrimination and calibration, respectively.

Keywords: discrimination analysis; calibration; receiver operating characteristic (ROC) curve; c-index; Brier score; Sanders decomposition; Murphy decomposition; Yates decomposition; rank order statistic; Hosmer–Lemeshow chi-square tests; goodness of fit; likelihood ratio test; graphical displays; multistate outcome
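To make the named measures concrete, the following is a minimal Python sketch of the c-index, the Brier score, and a Hosmer–Lemeshow-type χ² statistic for a two-state outcome. It is an illustration under stated assumptions, not the article's own procedure: the function names, the half-credit treatment of tied probabilities, and the equal-size grouping by sorted predicted probability are choices made here for the example.

```python
from typing import Sequence


def c_index(probs: Sequence[float], outcomes: Sequence[int]) -> float:
    """Discrimination: share of (event, non-event) pairs in which the event
    received the higher predicted probability. Ties count one half
    (an assumption of this sketch)."""
    events = [p for p, y in zip(probs, outcomes) if y == 1]
    non_events = [p for p, y in zip(probs, outcomes) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for pe in events:
        for pn in non_events:
            if pe > pn:
                concordant += 1.0
            elif pe == pn:
                concordant += 0.5
    return concordant / (len(events) * len(non_events))


def brier_score(probs: Sequence[float], outcomes: Sequence[int]) -> float:
    """Mean squared difference between predicted probability and 0/1 outcome."""
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)


def hosmer_lemeshow(probs: Sequence[float], outcomes: Sequence[int],
                    groups: int = 10) -> float:
    """Calibration: sort by predicted probability, split into `groups` bins of
    near-equal size, and sum (observed - expected)^2 / (n * pbar * (1 - pbar))
    over bins. Empty or degenerate bins are skipped (a choice of this sketch)."""
    pairs = sorted(zip(probs, outcomes))
    n = len(pairs)
    stat = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        size = len(chunk)
        expected = sum(p for p, _ in chunk)   # expected number of events
        observed = sum(y for _, y in chunk)   # observed number of events
        mean_p = expected / size
        denom = size * mean_p * (1.0 - mean_p)
        if denom > 0:
            stat += (observed - expected) ** 2 / denom
    return stat


if __name__ == "__main__":
    # Hypothetical toy data, for illustration only.
    p = [0.1, 0.4, 0.35, 0.8]
    y = [0, 0, 1, 1]
    print(c_index(p, y), brier_score(p, y), hosmer_lemeshow(p, y, groups=2))
```

A c-index of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation. The Hosmer–Lemeshow statistic is usually referred to a χ² distribution, commonly with the number of groups minus 2 degrees of freedom when it is computed on the sample used to fit the model.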
