Towards Maximizing the Area Under the ROC Curve for Multi-Class Classification Problems

The Area Under the ROC Curve (AUC) metric has achieved considerable success in binary classification problems because it measures the performance of classifiers without making any specific assumptions about the class distribution or misclassification costs. This is desirable because the class distribution and misclassification costs may be unknown during training, or may even change in the deployment environment. MAUC, the extension of AUC to multi-class problems, has also attracted considerable attention. However, despite the emergence of approaches for training classifiers with large AUC, little has been done for MAUC. This paper analyzes MAUC in depth and reveals that maximizing MAUC can be achieved by decomposing the multi-class problem into a number of independent sub-problems. These sub-problems take the form of a "learning to rank" problem, for which well-established methods already exist. Based on this analysis, a method that employs the RankBoost algorithm as the sub-problem solver is proposed to obtain classification systems with maximum MAUC. Empirical studies have shown the advantages of the proposed method over eight other relevant methods. Given the importance of MAUC to multi-class cost-sensitive learning and class-imbalanced learning, the proposed method is a general technique for both problems. It can also be generalized to accommodate other learning algorithms as sub-problem solvers.
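Since the quantity being maximized is Hand and Till's M measure of MAUC, a minimal NumPy sketch of how it reduces to an average of pairwise AUC terms may help; the function and variable names (`pairwise_auc`, `mauc`, a `score_matrix` of shape instances-by-classes with class labels 0..c-1) are illustrative assumptions, not part of the paper's implementation.

```python
import numpy as np
from itertools import combinations

def pairwise_auc(scores_pos, scores_neg):
    # Wilcoxon-Mann-Whitney estimate of AUC: fraction of (positive,
    # negative) pairs ranked correctly, counting ties as 0.5.
    pos = np.asarray(scores_pos, dtype=float)[:, None]  # (n_pos, 1)
    neg = np.asarray(scores_neg, dtype=float)[None, :]  # (1, n_neg)
    correct = (pos > neg).sum() + 0.5 * (pos == neg).sum()
    return correct / (pos.size * neg.size)

def mauc(y_true, score_matrix):
    # Hand & Till's M measure: average A(i, j) over all class pairs,
    # where A(i, j) = [A(i|j) + A(j|i)] / 2 and A(i|j) is the AUC of
    # the class-i score column restricted to instances of classes i and j.
    y_true = np.asarray(y_true)
    score_matrix = np.asarray(score_matrix, dtype=float)
    classes = range(score_matrix.shape[1])  # assumes labels are 0..c-1
    pairs = list(combinations(classes, 2))
    total = 0.0
    for i, j in pairs:
        in_i, in_j = (y_true == i), (y_true == j)
        a_i_given_j = pairwise_auc(score_matrix[in_i, i], score_matrix[in_j, i])
        a_j_given_i = pairwise_auc(score_matrix[in_j, j], score_matrix[in_i, j])
        total += 0.5 * (a_i_given_j + a_j_given_i)
    return total / len(pairs)

# Example: three classes, four instances.
y = np.array([0, 1, 2, 0])
scores = np.array([[0.7, 0.2, 0.1],
                   [0.1, 0.8, 0.1],
                   [0.2, 0.3, 0.5],
                   [0.6, 0.3, 0.1]])
print(mauc(y, scores))  # 1.0 here, since every class pair is ranked correctly
```

Each A(i, j) term depends only on the scores and instances of classes i and j, which is what allows the maximization of MAUC to be decomposed into independent pairwise ranking sub-problems.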
