Statistical Inference for Incomplete Ranking Data: The Case of Rank-Dependent Coarsening

We consider the problem of statistical inference for ranking data, specifically rank aggregation, under the assumption that samples are incomplete in the sense of not comprising all choice alternatives. In contrast to most existing methods, we explicitly model the process of turning a full ranking into an incomplete one, which we call the coarsening process. To this end, we propose the concept of rank-dependent coarsening, which assumes that incomplete rankings are produced by projecting a full ranking to a random subset of ranks. For a concrete instantiation of our model, in which full rankings are drawn from a Plackett-Luce distribution and observations take the form of pairwise preferences, we study the performance of various rank aggregation methods. In addition to predictive accuracy in the finite sample setting, we address the theoretical question of consistency, by which we mean the ability to recover a target ranking when the sample size goes to infinity, despite a potential bias in the observations caused by the (unknown) coarsening.
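The generative setting described above can be illustrated with a minimal Python sketch. It is an assumption-laden illustration, not the paper's implementation: the function names `sample_plackett_luce` and `coarsen_to_pair`, the 0-based rank positions, and the dictionary `rank_pair_probs` (mapping a pair of rank positions to its selection probability) are all hypothetical. A full ranking is drawn from a Plackett-Luce model, and a pairwise preference is then produced by projecting that ranking onto a pair of ranks chosen independently of the items occupying them.

```python
import random


def sample_plackett_luce(weights, rng):
    """Draw a full ranking (best item first) from a Plackett-Luce model.

    weights[i] is the positive skill parameter of item i.
    """
    remaining = list(range(len(weights)))
    ranking = []
    while remaining:
        # The next position goes to an item chosen with probability
        # proportional to its weight among the items not yet placed.
        chosen = rng.choices(remaining, weights=[weights[i] for i in remaining])[0]
        ranking.append(chosen)
        remaining.remove(chosen)
    return ranking


def coarsen_to_pair(ranking, rank_pair_probs, rng):
    """Project a full ranking onto a random pair of rank positions.

    rank_pair_probs maps (i, j) with i < j (0-based positions, an assumed
    encoding) to the probability of observing exactly those two ranks.
    The choice depends only on ranks, never on the items themselves.
    Returns (preferred_item, other_item).
    """
    pairs = list(rank_pair_probs)
    i, j = rng.choices(pairs, weights=[rank_pair_probs[p] for p in pairs])[0]
    return ranking[i], ranking[j]


if __name__ == "__main__":
    rng = random.Random(0)
    weights = [4.0, 2.0, 1.0]  # illustrative values only
    probs = {(0, 1): 0.5, (0, 2): 0.25, (1, 2): 0.25}  # illustrative values only
    sample = [
        coarsen_to_pair(sample_plackett_luce(weights, rng), probs, rng)
        for _ in range(5)
    ]
    print(sample)
```

Because the observed pair is selected by rank position, the resulting pairwise frequencies can differ from the pairwise marginals of the underlying Plackett-Luce model, which is the kind of bias the consistency question concerns.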
