An Extension of the Receiver Operating Characteristic Curve and AUC-Optimal Classification

While most proposed methods for classification problems focus on minimizing the classification error rate, we are interested in the receiver operating characteristic (ROC) curve, which provides more information about classification performance than the error rate does. The area under the ROC curve (AUC) is a natural summary measure for assessing a classifier on the basis of its ROC curve. We discuss a class of concave functions for AUC maximization, for which we consider a boosting-type algorithm that includes RankBoost as a special case, and we examine the Bayesian risk consistency and the lower bound of the optimum function. A procedure derived by maximizing a specific optimum function is shown to be highly robust in terms of gross error sensitivity. Additionally, we focus on the partial AUC, which is the area under a restricted portion of the ROC curve. In medical screening, for example, a high true-positive rate at a fixed low false-positive rate is preferable, so the partial AUC corresponding to low false-positive rates is much more important than the remaining area. We extend the class of concave optimum functions to partial AUC optimality with the boosting algorithm. We investigate the validity of the proposed method through several experiments with data sets from the UCI repository.
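To make the two quantities concrete, the following is a minimal sketch of the empirical AUC and the empirical partial AUC over false-positive rates in [0, max_fpr]. It is not the boosting method proposed in the paper; it only illustrates the criteria being optimized. The function names (`pair_score`, `empirical_auc`, `empirical_partial_auc`) and the example scores are hypothetical, ties are counted as 1/2, and the partial AUC assumes that max_fpr times the number of negatives is an integer.

```python
def pair_score(p, n):
    """Return 1 if the positive outranks the negative, 0.5 on a tie, else 0."""
    if p > n:
        return 1.0
    if p == n:
        return 0.5
    return 0.0


def empirical_auc(pos, neg):
    """Fraction of (positive, negative) pairs ranked correctly."""
    total = sum(pair_score(p, n) for p in pos for n in neg)
    return total / (len(pos) * len(neg))


def empirical_partial_auc(pos, neg, max_fpr):
    """Area under the empirical ROC curve for FPR in [0, max_fpr].

    Assumption: max_fpr * len(neg) is an integer k. Only the k
    highest-scoring negatives contribute to this region of the curve.
    """
    k = int(max_fpr * len(neg) + 1e-9)  # small tolerance for float rounding
    top_neg = sorted(neg, reverse=True)[:k]
    total = sum(pair_score(p, n) for p in pos for n in top_neg)
    return total / (len(pos) * len(neg))


# Hypothetical classifier scores for illustration only.
pos_scores = [0.9, 0.8, 0.3]
neg_scores = [0.7, 0.4, 0.2, 0.1]

print(f"AUC = {empirical_auc(pos_scores, neg_scores):.4f}")  # 0.8333
print(f"pAUC(FPR <= 0.5) = {empirical_partial_auc(pos_scores, neg_scores, 0.5):.4f}")  # 0.3333
```

In this example the partial AUC can be at most 0.5, so 0.3333 reflects that one positive (score 0.3) is ranked below the two highest-scoring negatives, an error that the full AUC of 0.8333 largely masks.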
