Minimax-optimal Inference from Partial Rankings

This paper studies the problem of rank aggregation under the Plackett-Luce model. The goal is to infer a global ranking and related scores of the items, based on partial rankings provided by multiple users over multiple subsets of items. A question of particular interest is how to optimally assign items to users for ranking and how many item assignments are needed to achieve a target estimation error. Without any assumptions on how the items are assigned to users, we derive an oracle lower bound and the Cramér-Rao lower bound on the estimation error. We prove an upper bound on the estimation error achieved by the maximum likelihood estimator, and show that both the upper bound and the Cramér-Rao lower bound depend inversely on the spectral gap of the Laplacian of an appropriately defined comparison graph. Since random comparison graphs are known to have large spectral gaps, this suggests using random assignments whenever the assignment is under our control. More precisely, the matching oracle lower bound and the upper bound on the estimation error imply that the maximum likelihood estimator together with a random assignment is minimax-optimal up to a logarithmic factor. We further analyze a popular rank-breaking scheme that decomposes partial rankings into pairwise comparisons. We show that even if one applies the mismatched maximum likelihood estimator that assumes independence (on pairwise comparisons that are now dependent due to rank-breaking), minimax-optimal performance is still achieved up to a logarithmic factor.
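
The pipeline described above (random assignment, Plackett-Luce rankings, rank-breaking, and a pairwise maximum likelihood fit that treats the broken comparisons as independent) can be illustrated with a small simulation. This is only a minimal sketch under stated assumptions, not the paper's exact estimator or analysis: the function names, the full breaking of each ranking into all ordered pairs, the plain gradient ascent, the step size, and the problem sizes are all illustrative choices.

```python
import math
import random


def sample_pl_ranking(items, theta, rng):
    """Draw a ranking of `items` under Plackett-Luce with scores `theta`.

    Items are picked one at a time, each with probability proportional
    to exp(theta[i]) among the items not yet ranked.
    """
    remaining = list(items)
    ranking = []
    while remaining:
        weights = [math.exp(theta[i]) for i in remaining]
        pick = rng.choices(remaining, weights=weights, k=1)[0]
        ranking.append(pick)
        remaining.remove(pick)
    return ranking


def rank_break(ranking):
    """Break a ranking into all (winner, loser) pairs (assumed full breaking)."""
    return [(ranking[a], ranking[b])
            for a in range(len(ranking))
            for b in range(a + 1, len(ranking))]


def fit_pairwise_mle(pairs, n, steps=200, lr=2.0):
    """Gradient ascent on the pairwise logistic log-likelihood.

    The pairs are treated as independent, which is the "mismatched"
    assumption: pairs broken from the same ranking are in fact dependent.
    Scores are re-centered to sum to zero after every step.
    """
    theta = [0.0] * n
    for _ in range(steps):
        grad = [0.0] * n
        for w, l in pairs:
            # Model probability that the observed loser would have won.
            p_upset = 1.0 / (1.0 + math.exp(theta[w] - theta[l]))
            grad[w] += p_upset
            grad[l] -= p_upset
        theta = [t + lr * g / len(pairs) for t, g in zip(theta, grad)]
        mean = sum(theta) / n
        theta = [t - mean for t in theta]
    return theta


def simulate(n=8, users=400, k=4, seed=0):
    """Each hypothetical user ranks k items drawn uniformly at random."""
    rng = random.Random(seed)
    theta_true = [-1.0 + 2.0 * i / (n - 1) for i in range(n)]
    pairs = []
    for _ in range(users):
        subset = rng.sample(range(n), k)  # random assignment of items
        pairs.extend(rank_break(sample_pl_ranking(subset, theta_true, rng)))
    return theta_true, fit_pairwise_mle(pairs, n)
```

Calling `simulate()` returns the true scores and their estimates, which can be compared entry by entry. Replacing the uniformly random subsets with a poorly connected assignment (for example, two groups of items that are rarely ranked together) is one way to probe the dependence on the spectral gap of the comparison graph that the abstract describes.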
