Multiclass ROC Analysis

Receiver operating characteristic (ROC) curves have become a common analysis tool for evaluating forecast discrimination: the ability of a forecast system to distinguish between events and nonevents. As is implicit in that statement, application of the ROC curve is limited to forecasts involving only two possible outcomes, such as rain and no rain. However, many forecast scenarios exist for which there are multiple possible outcomes, such as rain, snow, and freezing rain. An extension of the ROC curve to multiclass forecast problems is explored. The full extension involves high-dimensional hypersurfaces that cannot be visualized and that present other problems. Therefore, several different approximations to the full extension are introduced using both artificial and actual forecast datasets. These approximations range from sets of simple two-class ROC curves to sets of three-dimensional ROC surfaces. No single approximation is superior for all forecast problems; thus, the specific aims in evaluating the forecast must be considered.
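The simplest approximation mentioned above, a set of two-class ROC curves, can be illustrated with a minimal sketch. It assumes a one-vs-rest decomposition (each class in turn is treated as the event and all other classes as the nonevent), which is one plausible way to form such a set and is not necessarily the construction used in the paper. The area under each curve is computed through its equivalence with the Mann-Whitney statistic. The function names and the six toy forecasts are hypothetical, invented only for illustration.

```python
def auc_two_class(scores, is_event):
    """Area under a two-class ROC curve, computed as the Mann-Whitney statistic:
    the fraction of (event, nonevent) pairs in which the event received the
    higher forecast probability, with ties counted as one half."""
    events = [s for s, e in zip(scores, is_event) if e]
    nonevents = [s for s, e in zip(scores, is_event) if not e]
    if not events or not nonevents:
        raise ValueError("need at least one event and one nonevent")
    wins = 0.0
    for ev in events:
        for ne in nonevents:
            if ev > ne:
                wins += 1.0
            elif ev == ne:
                wins += 0.5
    return wins / (len(events) * len(nonevents))


def one_vs_rest_auc(prob_forecasts, observed, classes):
    """Return one two-class AUC per class, treating that class as the event
    and every other class as the nonevent (an assumed decomposition)."""
    result = {}
    for k, label in enumerate(classes):
        scores = [p[k] for p in prob_forecasts]
        is_event = [obs == label for obs in observed]
        result[label] = auc_two_class(scores, is_event)
    return result


# Hypothetical toy data: probabilities for (rain, snow, freezing rain).
classes = ["rain", "snow", "freezing rain"]
forecasts = [
    (0.7, 0.2, 0.1),
    (0.4, 0.1, 0.5),
    (0.2, 0.7, 0.1),
    (0.1, 0.8, 0.1),
    (0.3, 0.2, 0.5),
    (0.5, 0.1, 0.4),
]
observed = ["rain", "rain", "snow", "snow", "freezing rain", "freezing rain"]

aucs = one_vs_rest_auc(forecasts, observed, classes)
for label in classes:
    print(label, aucs[label])
```

A decomposition of this kind yields one discrimination score per class but does not show how the forecast system confuses pairs of nonevent classes, which is one reason the choice of approximation depends on the aims of the evaluation.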
