Paper information - Ranking Instances by Maximizing the Area under ROC Curve

In recent years, the problem of learning a real-valued function that induces a ranking over an instance space has gained importance in the machine learning literature. Here, we propose a supervised algorithm that learns a ranking function, called ranking instances by maximizing the area under the ROC curve (RIMARC). Since the area under the ROC curve (AUC) is a widely accepted performance measure for evaluating the quality of a ranking, the algorithm aims to maximize the AUC value directly. For a single categorical feature, we show the necessary and sufficient condition that any ranking function must satisfy to achieve the maximum AUC. We also sketch a method to discretize a continuous feature so that it, too, reaches the maximum AUC. RIMARC uses a heuristic to extend this maximization to all features of a data set. The ranking function learned by the RIMARC algorithm is in a human-readable form; therefore, it provides valuable information to domain experts for decision making. The performance of RIMARC is evaluated on many real-life data sets and compared against different state-of-the-art algorithms. Evaluations of the AUC metric show that RIMARC achieves significantly better performance compared to other similar methods.
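To make the single-feature claim concrete, the following minimal Python sketch computes the AUC as the Wilcoxon-Mann-Whitney statistic and scores each value of one categorical feature by its fraction of positive instances. The abstract does not state the condition itself, so the positive-rate scoring here is an assumption for illustration (it is the standard likelihood-ratio ordering), and the function names `auc` and `positive_rate_scores`, along with the toy data, are hypothetical rather than taken from the paper.

```python
from collections import defaultdict
from itertools import permutations


def auc(scores, labels):
    """AUC as the Wilcoxon-Mann-Whitney statistic; a tied pair counts 1/2."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def positive_rate_scores(values, labels):
    """Map each categorical value to its fraction of positive instances.

    Assumption for illustration only: this is not claimed to be the
    exact scoring rule used by RIMARC.
    """
    counts = defaultdict(lambda: [0, 0])  # value -> [positives, total]
    for v, y in zip(values, labels):
        counts[v][0] += y
        counts[v][1] += 1
    return {v: p / t for v, (p, t) in counts.items()}


# Toy data: one categorical feature with three values, binary labels.
values = ["a", "a", "a", "b", "b", "b", "c", "c", "c", "c"]
labels = [1, 1, 0, 1, 0, 0, 0, 0, 0, 1]

learned = positive_rate_scores(values, labels)
best = auc([learned[v] for v in values], labels)

# Brute-force check: no strict ordering of the three values beats it.
for order in permutations(sorted(set(values))):
    rank = {v: i for i, v in enumerate(order)}
    assert auc([rank[v] for v in values], labels) <= best

print(learned, best)
```

On this toy data, no ordering of the three feature values yields a higher AUC than ordering them by positive rate. The brute-force loop is only practical for a handful of values and serves as a sanity check, not as part of any learning procedure.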

H. Altay Güvenir | Murat Kurtcephe
