Multivariate Classification Rules: Calibration and Discrimination

This article presents several measures of discrimination and calibration, along with graphical displays for assessing each. It emphasizes multivariate classification rules in which the classification is into one of two possible states, and it also discusses extensions to multistate classifications. The c-index and the Hosmer–Lemeshow χ² statistic are the most widely used measures of discrimination and calibration, respectively.

Keywords: discrimination analysis; calibration; receiver-operating characteristic (ROC) curve; c-index; Brier score; Sanders decomposition; Murphy decomposition; Yates decomposition; rank order statistic; Hosmer–Lemeshow chi-square tests; goodness of fit; likelihood ratio test; graphical displays; multistate outcome
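To make the measures named above concrete, the following is a minimal sketch in plain Python of the c-index, the Brier score, and a Hosmer–Lemeshow-type χ² statistic for a two-state outcome. It is not taken from the article: the function names, the equal-size grouping of sorted predictions, the default of 10 groups, and the `groups - 2` degrees of freedom (the usual convention when the model is assessed on the data it was fitted to) are all assumptions made for illustration.

```python
def c_index(y, p):
    """Fraction of (event, non-event) pairs in which the event received the
    higher predicted probability; tied predictions count as one half."""
    events = [pi for yi, pi in zip(y, p) if yi == 1]
    nonevents = [pi for yi, pi in zip(y, p) if yi == 0]
    if not events or not nonevents:
        raise ValueError("need at least one event and one non-event")
    score = 0.0
    for pe in events:
        for pn in nonevents:
            if pe > pn:
                score += 1.0
            elif pe == pn:
                score += 0.5
    return score / (len(events) * len(nonevents))


def brier_score(y, p):
    """Mean squared difference between predicted probability and outcome."""
    return sum((pi - yi) ** 2 for yi, pi in zip(y, p)) / len(y)


def hosmer_lemeshow(y, p, groups=10):
    """Chi-square-type calibration statistic over groups of sorted predictions.

    Returns (statistic, degrees_of_freedom). The degrees of freedom use the
    assumed convention groups - 2.
    """
    order = sorted(range(len(y)), key=lambda i: p[i])
    n = len(order)
    stat = 0.0
    for g in range(groups):
        idx = order[g * n // groups:(g + 1) * n // groups]
        if not idx:
            continue  # more groups than observations
        n_g = len(idx)
        observed = sum(y[i] for i in idx)   # observed events in the group
        expected = sum(p[i] for i in idx)   # expected events in the group
        mean_p = expected / n_g
        variance = n_g * mean_p * (1.0 - mean_p)
        if variance > 0:                    # skip degenerate groups
            stat += (observed - expected) ** 2 / variance
    return stat, groups - 2


# Hypothetical toy data: outcomes (1 = event) and predicted probabilities.
y = [0, 0, 1, 1]
p = [0.1, 0.4, 0.35, 0.8]
print(c_index(y, p), brier_score(y, p), hosmer_lemeshow(y, p, groups=2))
```

A c-index of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation, while lower values of the Brier score and of the calibration statistic indicate closer agreement between predicted probabilities and observed outcomes. In practice the calibration statistic would be referred to a χ² distribution with the stated degrees of freedom, which requires more groups than the two used in this toy call.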
