Listwise Approach for Rank Aggregation in Crowdsourcing

Inferring a gold-standard ranking over a set of objects, such as documents or images, is a key task in building test collections for applications such as Web search and recommender systems. Crowdsourcing services provide an efficient and inexpensive way to collect judgments from groups of annotators. We therefore study the problem of finding a consensus ranking from crowdsourced judgments. Conventional rank aggregation methods minimize the distance between the predicted ranking and the input judgments from either a pointwise or a pairwise perspective. In contrast, we argue that it is critical to measure this distance in a listwise way, so as to emphasize the importance of position in a ranking. We therefore introduce a new listwise approach in which objective functions based on ranking measures are optimized. We also incorporate annotator quality into our model, since the reliability of annotators can vary significantly in crowdsourcing. We transform the optimization problem into the Linear Sum Assignment Problem and solve it with an efficient algorithm, named CrowdAgg, that guarantees the optimal solution. Experimental results on two benchmark data sets from different crowdsourcing tasks show that our algorithm is much more effective, efficient, and robust than traditional methods.
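To make the reduction to the Linear Sum Assignment Problem concrete, the following is a minimal sketch and not the CrowdAgg algorithm itself. It assumes graded labels per annotator, fixed annotator weights standing in for annotator quality, and a DCG-style position discount as the ranking measure; the function names, the toy data, and the brute-force solver are all hypothetical choices made for illustration.

```python
from itertools import permutations
from math import log2


def aggregated_gain(labels, weights):
    """Weighted sum of each annotator's graded label, per object.

    labels[k][i] is annotator k's label for object i (assumed format);
    weights[k] is an assumed, fixed quality weight for annotator k.
    """
    n_objects = len(labels[0])
    return [
        sum(w * annotator[i] for w, annotator in zip(weights, labels))
        for i in range(n_objects)
    ]


def profit_matrix(gains):
    """profit[i][j]: contribution of placing object i at 0-based position j.

    Uses a DCG-style discount 1 / log2(j + 2) as an assumed ranking measure.
    """
    n = len(gains)
    return [[g / log2(j + 2) for j in range(n)] for g in gains]


def solve_assignment_bruteforce(profit):
    """Return positions[i] = position of object i maximizing total profit.

    Exhaustive search over permutations; only suitable for tiny examples.
    A dedicated assignment solver would replace this in practice.
    """
    n = len(profit)
    best = max(
        permutations(range(n)),
        key=lambda pos: sum(profit[i][pos[i]] for i in range(n)),
    )
    return list(best)


# Toy data: three annotators label three objects on a 0-2 scale.
labels = [[2, 0, 1], [2, 1, 0], [0, 2, 1]]
weights = [1.0, 1.0, 0.2]  # the third annotator is assumed less reliable

gains = aggregated_gain(labels, weights)
positions = solve_assignment_bruteforce(profit_matrix(gains))
# Convert "position of each object" into "objects listed from top to bottom".
ranking = sorted(range(len(positions)), key=lambda i: positions[i])
print(ranking)
```

In this sketch each entry of the profit matrix factors into an object gain times a position discount, so sorting objects by aggregated gain yields the same ranking as the assignment solver. Whether the objective in the paper has this separable form is an assumption here.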
