Framework and Literature Analysis for Crowdsourcing’s Answer Aggregation

ABSTRACT This paper presents a classification framework and a systematic analysis of the literature on answer aggregation techniques for the most popular and important type of crowdsourcing, namely micro-task crowdsourcing. To that end, we analyze research articles published since 2006 and develop four classification taxonomies. First, we provide a classification framework based on the algorithmic characteristics of answer aggregation techniques. Second, we outline the statistical and probabilistic foundations used by the different types of algorithms and micro-tasks. Third, we provide a matrix catalog of the data characteristics for which each answer aggregation algorithm is designed. Fourth, we present a matrix catalog of the evaluation metrics commonly used for each type of micro-task. This paper represents the first systematic literature analysis and classification of answer aggregation techniques for micro-task crowdsourcing.
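For context on the problem the surveyed techniques address, the following minimal Python sketch (illustrative only, not drawn from any particular surveyed algorithm; task IDs and labels are hypothetical) shows the simplest aggregation baseline, majority voting over redundant worker answers. The probabilistic techniques classified in this survey generalize this baseline, for example by weighting each worker's vote by an estimated reliability.

```python
# Minimal sketch of majority-voting answer aggregation for micro-tasks.
from collections import Counter

def majority_vote(answers):
    """Aggregate {task_id: [worker labels]} into {task_id: consensus label}.

    Ties are broken in favor of the label encountered first.
    """
    return {task: Counter(labels).most_common(1)[0][0]
            for task, labels in answers.items()}

# Hypothetical example: three workers label two image-classification micro-tasks.
answers = {
    "img-01": ["cat", "cat", "dog"],
    "img-02": ["dog", "dog", "dog"],
}
print(majority_vote(answers))  # {'img-01': 'cat', 'img-02': 'dog'}
```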
