Use and Misuse of the Receiver Operating Characteristic Curve in Risk Prediction

The c statistic, or area under the receiver operating characteristic (ROC) curve, achieved popularity in diagnostic testing, in which the test characteristics of sensitivity and specificity are relevant to discriminating diseased versus nondiseased patients. The c statistic, however, may not be optimal in assessing models that predict future risk or stratify individuals into risk categories. In this setting, calibration is as important as discrimination to the accurate assessment of risk. For example, a biomarker with an odds ratio of 3 may have little effect on the c statistic, yet an increased level could shift estimated 10-year cardiovascular risk for an individual patient from 8% to 24%, which would lead to different treatment recommendations under current Adult Treatment Panel III guidelines. Accepted risk factors such as lipids, hypertension, and smoking each have only a marginal impact on the c statistic, yet they lead to more accurate reclassification of large proportions of patients into higher-risk or lower-risk categories. Perfectly calibrated models for complex disease can, in fact, only achieve values for the c statistic well below the theoretical maximum of 1. Use of the c statistic for model selection could thus naively eliminate established risk factors from cardiovascular risk prediction scores. As novel risk factors are discovered, sole reliance on the c statistic to evaluate their utility as risk predictors seems ill-advised.
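The claim that a perfectly calibrated model cannot reach a c statistic of 1 can be illustrated with a short calculation. The sketch below is not taken from the paper: the function name `expected_c_statistic` and the three-level risk distribution are hypothetical, chosen only for illustration. It assumes every predicted risk equals the true event probability and computes the expected c statistic as the chance that a randomly chosen case has a higher predicted risk than a randomly chosen non-case, counting ties as one half.

```python
from itertools import product


def expected_c_statistic(risks, weights):
    """Expected c statistic of a perfectly calibrated model.

    risks   : distinct predicted risk levels; each is assumed to equal
              the true event probability at that level (perfect calibration)
    weights : share of the population at each risk level (should sum to 1)
    """
    concordant = 0.0
    all_pairs = 0.0
    levels = list(zip(risks, weights))
    for (p_i, w_i), (p_j, w_j) in product(levels, repeat=2):
        # Probability mass of pairs made of a case at level i
        # and a non-case at level j.
        pair_mass = w_i * p_i * w_j * (1.0 - p_j)
        all_pairs += pair_mass
        if p_i > p_j:
            concordant += pair_mass        # case ranked above non-case
        elif p_i == p_j:
            concordant += 0.5 * pair_mass  # tie counts as one half
    return concordant / all_pairs


# Hypothetical population: 60% at 2% risk, 30% at 8%, 10% at 24%.
risks = [0.02, 0.08, 0.24]
weights = [0.6, 0.3, 0.1]
c = expected_c_statistic(risks, weights)
print(f"expected c statistic: {c:.3f}")
```

Under these assumed numbers the result falls well short of 1 even though no prediction is miscalibrated, because many people with low true risk still have events and many with high true risk do not. The c statistic approaches 1 only when the true risks themselves sit near 0 or 1.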
