Beyond majority: Label ranking ensembles based on voting rules

Abstract: Label ranking is a machine learning task that maps an instance to a ranking of labels, representing the labels' ordered relevance to the instance. Three recent studies have suggested using ensembles to improve the performance of simple label ranking models. However, none of them has explicitly examined how the results obtained by the simple models should be aggregated into a single combined output. While classification and regression tasks typically employ trivial aggregation techniques (majority voting and averaging, respectively), the case of label ranking is not straightforward. The contribution of this paper is twofold. First, we propose applying voting rules, typically used in the field of social choice, as the aggregation technique for label ranking ensembles. Our evaluation reveals that no single voting rule consistently outperforms all the others, and that different voting rules perform best under different settings. Second, we propose a novel aggregation method for label ranking ensembles that learns the best voting rule to use in a given setting. An extensive evaluation of the proposed method on semi-synthetic as well as real-world datasets shows that it obtains prediction performance significantly higher than that of the aggregation techniques currently used by state-of-the-art label ranking ensembles.
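To make the aggregation question concrete, the following is a minimal sketch (not taken from the paper) of one classical voting rule, the Borda count, applied to the outputs of a label ranking ensemble. Each base model votes with a full ranking of the labels; a label in position i of an m-label ranking receives m-1-i points, and labels are ordered by total score. The function name and the tie-breaking choice are illustrative assumptions, not the paper's method.

```python
# Illustrative sketch: Borda-count aggregation of base-model label rankings.
# Each ranking lists labels from most to least relevant; a label in
# position i of an m-label ranking scores m - 1 - i points.

from collections import defaultdict

def borda_aggregate(rankings):
    """Combine several label rankings into one via the Borda count."""
    scores = defaultdict(int)
    for ranking in rankings:
        m = len(ranking)
        for position, label in enumerate(ranking):
            scores[label] += m - 1 - position
    # Sort labels by descending Borda score; break ties alphabetically
    # (tie-breaking policy is an arbitrary choice for this sketch).
    return sorted(scores, key=lambda lbl: (-scores[lbl], lbl))

# Three base models each predict a ranking over labels A, B, C.
ensemble_outputs = [["A", "B", "C"], ["A", "C", "B"], ["B", "A", "C"]]
print(borda_aggregate(ensemble_outputs))  # → ['A', 'B', 'C']
```

Other rules studied in social choice (e.g., Copeland or Kemeny-based rules) would aggregate the same votes differently, which is precisely why the choice of rule matters and why no single rule dominates across settings.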
