Ranking Multi-Class Data: Optimality and Pairwise Aggregation

The primary purpose of this paper is to set out rigorously the goals of ranking in a multi-class context, building on recent results in the bipartite framework. Under specific likelihood ratio monotonicity conditions, optimal solutions for this global learning problem are described in the ordinal situation, i.e. when there exists a natural order on the set of labels. Criteria reflecting ranking performance under these conditions, such as the ROC surface and its natural summary, the volume under the ROC surface (VUS), are then considered as targets for empirical optimization. Plug-in techniques and the Empirical Risk Maximization principle can be easily extended to the ordinal multi-class setting. In contrast, reducing the K-partite ranking task to a collection of bipartite ranking problems, in the spirit of the pairwise comparison approach in classification, is more challenging. Here we consider a concept of ranking rule consensus based on the Kendall τ distance and show that, when such a consensus exists and is based on consistent ranking rules for the bipartite ranking subproblems defined by all consecutive pairs of labels, it forms a consistent ranking rule in the VUS sense under adequate conditions. This result paves the way for extending the use of recently developed learning algorithms, tailored for bipartite ranking, to multi-class data in a valid theoretical framework. Preliminary experimental results are presented for illustration purposes.
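As a rough illustration of the two quantities named above, the following Python sketch computes an empirical VUS for three ordered classes and an empirical Kendall τ distance between two scoring rules on a common sample. The function names `empirical_vus` and `kendall_tau_distance` are hypothetical, the class labels are assumed to be 1 < 2 < 3, and the handling of ties (a tied triple counts as incorrectly ordered, a tied pair counts as agreement) is an assumption made here for simplicity, not necessarily the convention used in the paper.

```python
from itertools import combinations, product


def empirical_vus(scores, labels):
    """Fraction of (class-1, class-2, class-3) triples whose scores are
    strictly increasing. Assumes labels take values in {1, 2, 3}."""
    by_class = {k: [s for s, y in zip(scores, labels) if y == k]
                for k in (1, 2, 3)}
    triples = list(product(by_class[1], by_class[2], by_class[3]))
    if not triples:
        raise ValueError("need at least one instance of each class")
    # Assumption: ties are counted as incorrectly ordered.
    concordant = sum(1 for a, b, c in triples if a < b < c)
    return concordant / len(triples)


def kendall_tau_distance(scores_a, scores_b):
    """Fraction of instance pairs that the two scoring rules order in
    opposite directions (empirical, normalized Kendall tau distance)."""
    if len(scores_a) != len(scores_b):
        raise ValueError("both rules must score the same instances")
    pairs = list(combinations(range(len(scores_a)), 2))
    if not pairs:
        raise ValueError("need at least two instances")
    # Assumption: a pair tied under either rule counts as agreement.
    discordant = sum(
        1 for i, j in pairs
        if (scores_a[i] - scores_a[j]) * (scores_b[i] - scores_b[j]) < 0
    )
    return discordant / len(pairs)
```

Under these assumptions, a scoring rule that places every class-1 instance below every class-2 instance, and every class-2 instance below every class-3 instance, attains an empirical VUS of 1, and two rules that induce the same ordering of the sample are at Kendall τ distance 0. The brute-force enumeration is only meant for small samples.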
