Generalized Rank-Breaking: Computational and Statistical Tradeoffs

For massive and heterogeneous modern datasets, it is of fundamental interest to provide guarantees on the accuracy of estimation when computational resources are limited. In the application of rank aggregation, for the Plackett-Luce model, we provide a hierarchy of rank-breaking mechanisms ordered by the complexity of the sketch of the data that each one generates. This allows the number of data points collected to be gracefully traded off against the computational resources available, while guaranteeing the desired level of accuracy. Theoretical guarantees on the proposed generalized rank-breaking implicitly provide such trade-offs, which can be explicitly characterized under certain canonical scenarios on the structure of the data. Further, the proposed generalized rank-breaking algorithm involves set-wise comparisons as opposed to traditional pairwise comparisons. The maximum likelihood estimate for pairwise comparisons can be computed efficiently using the celebrated minorization-maximization algorithm (Hunter, 2004). To compute the pseudo-maximum likelihood estimate for the set-wise comparisons, we provide a generalization of the minorization-maximization algorithm and give guarantees on its convergence.
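
For orientation, the pairwise case mentioned above can be sketched as the standard minorization-maximization update for the Bradley-Terry model: each weight is set to the item's total number of wins divided by a sum over its opponents of (number of comparisons) / (sum of the two current weights), followed by renormalization. The sketch below is only an illustration of that pairwise update, not the generalized set-wise algorithm proposed in the paper. The function name `bradley_terry_mm`, the `wins` matrix layout, and the stopping rule are assumptions made for this example; it also assumes a zero diagonal, that every item takes part in at least one comparison, and that the comparison graph is connected enough for the estimate to exist.

```python
import numpy as np

def bradley_terry_mm(wins, num_iters=500, tol=1e-12):
    """Illustrative MM iteration for Bradley-Terry weights.

    wins[i, j] is the number of times item i beat item j (hypothetical
    input layout; the diagonal is assumed to be zero).
    Returns weights normalized to sum to one.
    """
    wins = np.asarray(wins, dtype=float)
    n = wins.shape[0]
    total_wins = wins.sum(axis=1)      # W_i: total wins of item i
    pair_counts = wins + wins.T        # n_ij: comparisons between i and j
    w = np.full(n, 1.0 / n)            # uniform starting point
    for _ in range(num_iters):
        # denom_i = sum_j n_ij / (w_i + w_j); diagonal terms are zero
        denom = (pair_counts / (w[:, None] + w[None, :])).sum(axis=1)
        w_new = total_wins / denom
        w_new /= w_new.sum()           # fix the scale of the weights
        converged = np.max(np.abs(w_new - w)) < tol
        w = w_new
        if converged:
            break
    return w

# Two items: item 0 wins 3 of 4 comparisons against item 1.
weights = bradley_terry_mm([[0, 3], [1, 0]])
print(weights)
```

In the two-item usage above, the estimated probability that item 0 beats item 1 is `weights[0] / (weights[0] + weights[1])`, which matches the empirical win fraction of 3/4.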

[1]  Ashish Khetan, et al. Data-driven Rank Breaking for Efficient Rank Aggregation, 2016, J. Mach. Learn. Res.

[2]  Devavrat Shah, et al. Rank Centrality: Ranking from Pairwise Comparisons, 2012, Oper. Res.

[3]  Yi-Ching Yao, et al. Asymptotics when the number of parameters tends to infinity in the Bradley-Terry model for paired comparisons, 1999.

[4]  Nathan Srebro, et al. SVM optimization: inverse dependence on training set size, 2008, ICML '08.

[5]  D. Hunter. MM algorithms for generalized Bradley-Terry models, 2003.

[6]  Andreas Krause, et al. Tradeoffs for Space, Time, Data and Risk in Unsupervised Learning, 2015, AISTATS.

[7]  David C. Parkes, et al. Computing Parametric Ranking Models via Rank-Breaking, 2014, ICML.

[8]  Léon Bottou, et al. The Tradeoffs of Large Scale Learning, 2007, NIPS.

[9]  Andrzej Lingas, et al. Faster algorithms for finding lowest common ancestors in directed acyclic graphs, 2007, Theor. Comput. Sci.

[10]  Martin J. Wainwright, et al. Stochastically Transitive Models for Pairwise Comparisons: Statistical and Computational Issues, 2015, IEEE Transactions on Information Theory.

[11]  Thomas P. Hayes. A large-deviation inequality for vector-valued martingales, 2003.

[12]  Stochastic Programming, et al. Logarithmic Concave Measures and Related Topics, 1980.

[13]  Bruce E. Hajek, et al. Minimax-optimal Inference from Partial Rankings, 2014, NIPS.

[14]  Andrea Montanari, et al. Improved Sum-of-Squares Lower Bounds for Hidden Clique and Hidden Submatrix Problems, 2015, COLT.

[15]  Kenneth Y. Goldberg, et al. Eigentaste: A Constant Time Collaborative Filtering Algorithm, 2001, Information Retrieval.

[16]  Marina Meila, et al. Experiments with Kemeny ranking: What works when?, 2012, Math. Soc. Sci.

[17]  L. R. Ford. Solution of a Ranking Problem from Binary Comparisons, 1957.

[18]  Yuxin Chen, et al. Spectral MLE: Top-K Rank Aggregation from Pairwise Comparisons, 2015, ICML.

[19]  Rolf Niedermeier, et al. Theoretical and empirical evaluation of data reduction for exact Kemeny Rank Aggregation, 2014, Autonomous Agents and Multi-Agent Systems.

[20]  David C. Parkes, et al. Random Utility Theory for Social Choice, 2012, NIPS.

[21]  Michael I. Jordan, et al. Computational and statistical tradeoffs via convex relaxation, 2012, Proceedings of the National Academy of Sciences.

[22]  Matthias Grossglauser, et al. Fast and Accurate Inference of Plackett-Luce Models, 2015, NIPS.

[23]  E. Zermelo. Die Berechnung der Turnier-Ergebnisse als ein Maximumproblem der Wahrscheinlichkeitsrechnung, 1929.

[24]  Avi Wigderson, et al. Sum-of-squares Lower Bounds for Planted Clique, 2015, STOC.

[25]  Toshihiro Kamishima, et al. Nantonac collaborative filtering: recommendation based on order responses, 2003, KDD '03.

[26]  Peter L. Bartlett, et al. Oracle inequalities for computationally adaptive model selection, 2012, ArXiv.

[27]  L. Bottou, et al. Generalized Method-of-Moments for Rank Aggregation, 2013.

[28]  Steven Skiena, et al. Lowest common ancestors in trees and directed acyclic graphs, 2005, J. Algorithms.

[29]  Martin J. Wainwright, et al. Estimation from Pairwise Comparisons: Sharp Minimax Bounds with Topology Dependence, 2015, J. Mach. Learn. Res.