A Statistical Convergence Perspective of Algorithms for Rank Aggregation from Pairwise Data

There has been much interest recently in the problem of rank aggregation from pairwise data. A natural question that arises is: under what sorts of statistical assumptions do various rank aggregation algorithms converge to an 'optimal' ranking? In this paper, we consider this question in a natural setting where pairwise comparisons are drawn randomly and independently from some underlying probability distribution. We first show that, under a 'time-reversibility' or Bradley-Terry-Luce (BTL) condition on the distribution, the rank centrality (PageRank) and least squares (HodgeRank) algorithms both converge to an optimal ranking. Next, we show that a matrix version of the Borda count algorithm, and more surprisingly, an algorithm which performs maximum likelihood estimation under a BTL assumption, both converge to an optimal ranking under a 'low-noise' condition that is strictly more general than BTL. Finally, we propose a new SVM-based algorithm for rank aggregation from pairwise data, and show that this converges to an optimal ranking under an even more general condition that we term 'generalized low-noise'. In all cases, we provide explicit sample complexity bounds for exact recovery of an optimal ranking. Our experiments confirm our theoretical findings and help to shed light on the statistical behavior of various rank aggregation algorithms.
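As a rough illustration of the first two results, the sketch below builds an exact BTL preference matrix and ranks items with two simplified scoring rules. It is a minimal sketch under stated assumptions, not the paper's implementation: the function names, the convention that `P[i][j]` is the probability that item `i` beats item `j`, the use of row sums as the "matrix Borda" score, and the particular random-walk normalization used for rank centrality are all choices made here for illustration. It works on the population matrix rather than on sampled comparisons, so it says nothing about sample complexity.

```python
def btl_matrix(w):
    """Pairwise preference matrix under BTL: P[i][j] = w[i] / (w[i] + w[j]).

    Assumed convention: P[i][j] is the probability that item i beats item j.
    """
    n = len(w)
    return [[0.0 if i == j else w[i] / (w[i] + w[j]) for j in range(n)]
            for i in range(n)]


def matrix_borda(P):
    """Score each item by its total win probability against all other items."""
    n = len(P)
    return [sum(P[i][j] for j in range(n) if j != i) for i in range(n)]


def rank_centrality(P, iters=1000):
    """Stationary distribution of a random walk that moves from i to j
    with probability proportional to how often j beats i.

    Under BTL the chain is time-reversible and the stationary distribution
    is proportional to the BTL scores w.
    """
    n = len(P)
    # Off-diagonal transitions; dividing by n keeps every row sum below 1.
    Q = [[P[j][i] / n if i != j else 0.0 for j in range(n)] for i in range(n)]
    for i in range(n):
        Q[i][i] = 1.0 - sum(Q[i])  # self-loop absorbs the remaining mass
    pi = [1.0 / n] * n
    for _ in range(iters):  # power iteration: pi <- pi Q
        pi = [sum(pi[i] * Q[i][j] for i in range(n)) for j in range(n)]
    return pi


def ranking(scores):
    """Item indices sorted from highest to lowest score."""
    return sorted(range(len(scores)), key=lambda i: -scores[i])


if __name__ == "__main__":
    w = [4.0, 3.0, 2.0, 1.0]  # hypothetical BTL scores, item 0 is best
    P = btl_matrix(w)
    print(ranking(matrix_borda(P)))
    print(ranking(rank_centrality(P)))
```

With empirical data, `P` would be replaced by the matrix of observed win frequencies; both rules then recover the ordering only when the estimates are accurate enough, which is the regime the sample complexity bounds address.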
