Applications of ROC analysis in medical research: recent developments and future directions.

With the growing focus on comparative effectiveness research and personalized medicine, receiver-operating characteristic (ROC) analysis can continue to play an important role in health care decision making. Specific applications of ROC analysis include predictive model assessment and validation, biomarker diagnostics, responder analysis in patient-reported outcomes, and comparison of alternative treatment options. The authors survey the potential applications of the method and briefly review several relevant extensions. Given the level of attention paid to biomarker validation, personalized medicine, and comparative effectiveness research, it is highly likely that ROC analysis will remain an important visual and analytic tool for medical research and evidence-based medicine in the foreseeable future.
