HodgeRank With Information Maximization for Crowdsourced Pairwise Ranking Aggregation

Recently, crowdsourcing has emerged as an effective paradigm for human-powered, large-scale problem solving in various domains. However, a task requester usually has a limited budget, so it is desirable to have a policy that allocates the budget wisely to achieve better quality. In this paper, we study the principle of information maximization for active sampling strategies in the framework of HodgeRank, an approach based on the Hodge decomposition of pairwise ranking data from multiple workers. The principle yields two scenarios of active sampling: Fisher information maximization, which leads to unsupervised sampling based on a sequential maximization of graph algebraic connectivity without considering labels; and Bayesian information maximization, which selects samples with the largest information gain from prior to posterior and thus gives a supervised sampling scheme that uses the labels collected. Experiments show that the proposed methods improve sampling efficiency compared with traditional sampling schemes and are thus valuable for practical crowdsourcing experiments.
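
To make the unsupervised scenario concrete, the following is a minimal sketch of greedy pair selection that maximizes algebraic connectivity (the second-smallest eigenvalue of the graph Laplacian) at each step. It is an illustration under stated assumptions, not the paper's algorithm: the function names, the brute-force search over all candidate pairs, the unit edge weights, and the initial path graph used to make the comparison graph connected are all hypothetical choices made here.

```python
import itertools
import numpy as np


def laplacian(n, edges):
    """Unweighted graph Laplacian; repeated pairs accumulate as a multigraph."""
    L = np.zeros((n, n))
    for i, j in edges:
        L[i, i] += 1.0
        L[j, j] += 1.0
        L[i, j] -= 1.0
        L[j, i] -= 1.0
    return L


def algebraic_connectivity(L):
    """Second-smallest eigenvalue of a symmetric Laplacian."""
    return np.linalg.eigvalsh(L)[1]


def greedy_connectivity_sampling(n, initial_edges, budget):
    """Add `budget` pairs, each chosen to maximize algebraic connectivity.

    Labels are never used, so the selection is unsupervised.
    """
    edges = list(initial_edges)
    candidates = list(itertools.combinations(range(n), 2))
    for _ in range(budget):
        # Brute force: score every candidate pair by the connectivity
        # of the graph that would result from adding it.
        best = max(
            candidates,
            key=lambda e: algebraic_connectivity(laplacian(n, edges + [e])),
        )
        edges.append(best)
    return edges


# Assumption: start from a path 0-1-2-3-4 so the graph is connected.
n_items = 5
path = [(i, i + 1) for i in range(n_items - 1)]
sampled = greedy_connectivity_sampling(n_items, path, budget=3)
print(sampled)
print(algebraic_connectivity(laplacian(n_items, sampled)))
```

The brute-force scoring costs one eigendecomposition per candidate pair per step, which is only practical for small item sets; it is used here purely to keep the sketch short and easy to check.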
