A risk-based comparison of classification systems

Performance measures for families of classification systems that rely upon the analysis of receiver operating characteristics (ROCs), such as area under the ROC curve (AUC), often fail to fully address the issue of risk, especially for classification systems involving more than two classes. For the general case, we define matrices of class prevalences, costs, and class-conditional probabilities. We assume that costs are subjectively fixed, that acceptable estimates for the expected values of the class-conditional probabilities exist, and that each variable in one such matrix is mutually independent of the variables in any other matrix. The ROC Risk Functional (RRF), valid for any finite number of classes, takes a parameter argument that specifies a member of a family of classification systems; over that family, there is an associated classification system that minimizes Bayes risk. We typify joint distributions for class prevalences over standard simplices by means of uniform and beta distributions, and create a family of classification systems using actual data, testing the independence assumptions under two such class prevalence distributions. Examples are given where the risk is minimized under two different sets of costs.
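
To make the idea of minimizing risk over a family concrete, the following is a minimal sketch rather than the paper's RRF itself. It assumes the textbook Bayes risk, the sum over true class i and decided class j of prevalence times cost times class-conditional probability, with prevalences fixed at a single point of the simplex. The function names, the parameter values 0.2, 0.5, and 0.8, and all numbers are hypothetical and chosen only for illustration.

```python
def bayes_risk(prevalence, costs, cond_probs):
    """Sum over true class i and decided class j of
    prevalence[i] * costs[i][j] * cond_probs[i][j]."""
    return sum(
        p * c * q
        for p, cost_row, prob_row in zip(prevalence, costs, cond_probs)
        for c, q in zip(cost_row, prob_row)
    )


def min_risk_member(prevalence, costs, family):
    """Return (parameter, risk) for the family member with the smallest risk."""
    risks = {theta: bayes_risk(prevalence, costs, probs)
             for theta, probs in family.items()}
    best = min(risks, key=risks.get)
    return best, risks[best]


# Hypothetical three-class setup (illustrative numbers only).
prevalence = [0.5, 0.3, 0.2]

# Each family member is indexed by a parameter and described by a matrix
# whose entry [i][j] is P(decide class j | true class i).
family = {
    0.2: [[0.90, 0.05, 0.05], [0.20, 0.60, 0.20], [0.30, 0.30, 0.40]],
    0.5: [[0.80, 0.10, 0.10], [0.10, 0.80, 0.10], [0.20, 0.20, 0.60]],
    0.8: [[0.60, 0.20, 0.20], [0.05, 0.90, 0.05], [0.10, 0.10, 0.80]],
}

# Two hypothetical cost sets: equal misclassification costs, and one in
# which errors on class 0 are five times as costly.
costs_equal = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
costs_skewed = [[0, 5, 5], [1, 0, 1], [1, 1, 0]]

best_equal, risk_equal = min_risk_member(prevalence, costs_equal, family)
best_skewed, risk_skewed = min_risk_member(prevalence, costs_skewed, family)
print(best_equal, best_skewed)
```

With these illustrative numbers, the member that minimizes risk changes when the cost matrix changes, which is the kind of behavior the two-cost-set examples are meant to expose. Averaging the risk over a prevalence distribution on the simplex, as the abstract describes, would replace the fixed `prevalence` vector with expected prevalences under the stated independence assumption.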
