Unsupervised rank aggregation with distance-based models

Working with ranked data often requires combining several rankings into one meaningful ranking. Although a number of heuristic and supervised learning approaches to rank aggregation exist, they require either domain knowledge or labeled ranking data, both of which are expensive to acquire. To address these limitations, we propose a mathematical and algorithmic framework for learning to aggregate (partial) rankings without supervision. We instantiate the framework for two cases, combining permutations and combining top-k lists, and propose a novel metric for the latter. Experiments in both scenarios demonstrate the effectiveness of the proposed formalism.
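As a rough illustration of what "distance-based" means for permutations, the sketch below computes the Kendall tau distance between two rankings, a Mallows-style unnormalized weight that decays with that distance, and a brute-force consensus ranking that minimizes the total distance to the inputs. This is not the framework proposed in the paper: the function names, the dispersion parameter `theta`, and the equal weighting of all input rankings are assumptions made for the example, and the exhaustive search is only feasible for a handful of items.

```python
from itertools import combinations, permutations
from math import exp


def kendall_tau_distance(a, b):
    """Count the item pairs that rankings a and b order differently."""
    pos_b = {item: i for i, item in enumerate(b)}
    # combinations(a, 2) yields pairs (x, y) with x ranked above y in a.
    return sum(1 for x, y in combinations(a, 2) if pos_b[x] > pos_b[y])


def mallows_weight(ranking, center, theta):
    """Unnormalized Mallows-style weight: exp(-theta * distance to center).

    theta is a hypothetical dispersion parameter; larger values concentrate
    the weight more tightly around the center ranking.
    """
    return exp(-theta * kendall_tau_distance(ranking, center))


def aggregate(rankings):
    """Brute-force consensus: the permutation with the smallest total
    Kendall tau distance to all inputs (assumes equal trust in each input)."""
    items = rankings[0]
    return min(
        permutations(items),
        key=lambda c: sum(kendall_tau_distance(r, c) for r in rankings),
    )


if __name__ == "__main__":
    votes = [("a", "b", "c"), ("a", "c", "b"), ("b", "a", "c")]
    print(aggregate(votes))
```

An unsupervised approach would additionally have to estimate how much each input ranker should be trusted rather than weighting them equally, which this sketch deliberately leaves out.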