Diagnostic accuracy and receiver-operating characteristics curve analysis in surgical research and decision making.

In surgical research, the ability to correctly distinguish one condition or outcome from another is of great importance for variables that influence clinical decision making. Receiver-operating characteristic (ROC) curve analysis is a useful tool for assessing the diagnostic accuracy of any variable with a continuous spectrum of results. To rule a disease state in or out with a given test, the test results are usually made binary, with arbitrarily chosen cut-offs for defining disease versus health or for grading disease severity. In the postgenomic era, the bench-to-bedside translation of biomarkers in various tissues and body fluids requires appropriate tools for analysis. In contrast to predetermining a cut-off value to define disease, ROC analysis can test diagnostic accuracy across the entire range of variable scores and test outcomes. In addition, ROC analysis readily supports visual and statistical comparisons across tests or scores. ROC analysis is also favored because it is thought to be independent of the prevalence of the condition under investigation. ROC analysis is used in various surgical settings and across disciplines, including cancer research, biomarker assessment, imaging evaluation, and assessment of risk scores. With appropriate use, ROC curves may help identify the most appropriate cut-off value for clinical and surgical decision making and avoid the confounding effects seen with subjective ratings. ROC curve results should always be put in perspective, because a good classifier does not guarantee the expected clinical outcome. In this review, we discuss the fundamental roles, suggested presentation, potential biases, and interpretation of ROC analysis in surgical research.
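As a rough illustration of testing accuracy across the whole range of cut-offs, the following minimal Python sketch builds an empirical ROC curve, estimates the area under it with the trapezoidal rule, and picks a cut-off by the Youden index (sensitivity + specificity - 1). The marker values, the labels, and the function names are hypothetical and are not taken from the review. The Youden index is only one of several possible criteria for choosing a cut-off, and the sketch assumes that higher marker values indicate disease.

```python
def roc_points(scores, labels):
    """Return (cutoff, sensitivity, 1 - specificity) for every observed cut-off.

    A case is called test-positive when its score is >= the cut-off.
    labels: 1 = condition present, 0 = condition absent.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = []
    # Walk from the strictest cut-off to the most lenient one.
    for cutoff in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 0)
        points.append((cutoff, tp / n_pos, fp / n_neg))
    return points


def auc_trapezoid(points):
    """Area under the empirical ROC curve, starting from the origin (0, 0)."""
    area = 0.0
    prev_fpr, prev_tpr = 0.0, 0.0
    for _, tpr, fpr in points:
        area += (fpr - prev_fpr) * (tpr + prev_tpr) / 2
        prev_fpr, prev_tpr = fpr, tpr
    return area


def youden_cutoff(points):
    """Cut-off that maximizes sensitivity + specificity - 1."""
    return max(points, key=lambda p: p[1] - p[2])[0]


# Hypothetical marker values for eight patients (illustrative data only).
marker = [1, 2, 3, 4, 5, 6, 7, 8]
disease = [0, 0, 1, 0, 0, 1, 1, 1]

curve = roc_points(marker, disease)
print("AUC:", auc_trapezoid(curve))
print("Youden cut-off:", youden_cutoff(curve))
```

The cut-off is selected after inspecting every observed threshold, not fixed in advance, which is the contrast with predetermined cut-offs drawn in the text. In practice the chosen value would still need validation in an independent sample.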
