Paper information - Context-aware result inference in crowdsourcing

Abstract Many result inference methods have been proposed to address the quality-control problem in crowdsourcing. However, existing methods are ineffective for context-sensitive tasks (CSTs), e.g., handwriting recognition, translation, and speech transcription, where the context correlation within a task cannot be ignored, for two reasons. First, it is ineffective to crowdsource a whole CST (e.g., recognizing a handwritten text) and use task-level inference methods to infer the answer, because it is hard to complete a whole complicated task correctly. Second, although a CST is composed of a set of atomic subtasks (e.g., recognizing a handwritten word), it is unsuitable to split it into subtasks and adopt a subtask-level inference algorithm to infer the result, because doing so loses the context correlation (e.g., phrases) among subtasks and makes the task harder to complete. A new approach to handling CSTs is therefore needed. In this work, we study the result inference problem for CSTs and propose a context-aware inference algorithm that incorporates the context information. Furthermore, we introduce an iterative method to improve the quality. Experiments on real-world CSTs demonstrate that our approach outperforms state-of-the-art methods.
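The contrast the abstract draws between subtask-level inference and context-aware inference can be illustrated with a small sketch. This is not the algorithm proposed in the paper, whose details are not given here. It is a generic Viterbi-style decoder, chosen as an assumption, that combines worker vote shares with a score for adjacent answers. The worker answers and the `phrase_score` function are invented purely for illustration.

```python
from collections import Counter
from math import log


def majority_vote(answers_per_subtask):
    """Subtask-level inference: pick the most frequent answer at each position independently."""
    return [Counter(answers).most_common(1)[0][0] for answers in answers_per_subtask]


def context_aware_decode(answers_per_subtask, pair_score, smoothing=1e-6):
    """Illustrative Viterbi-style decoding over worker answers (assumes a non-empty task).

    pair_score(prev, cur) is a hypothetical context model that returns a
    non-negative score for two adjacent answers, e.g. a phrase frequency.
    """
    # Log vote share of every candidate answer at every position.
    candidates = []
    for answers in answers_per_subtask:
        counts = Counter(answers)
        total = sum(counts.values())
        candidates.append({a: log(c / total) for a, c in counts.items()})

    # best[a] = (score of the best sequence ending in a, that sequence)
    best = {a: (score, [a]) for a, score in candidates[0].items()}
    for position in candidates[1:]:
        new_best = {}
        for cur, vote in position.items():
            # Choose the predecessor that maximizes path score plus context score.
            prev, (score, path) = max(
                best.items(),
                key=lambda item: item[1][0] + log(pair_score(item[0], cur) + smoothing),
            )
            context = log(pair_score(prev, cur) + smoothing)
            new_best[cur] = (score + context + vote, path + [cur])
        best = new_best
    return max(best.values(), key=lambda entry: entry[0])[1]


# Made-up worker answers for a two-word handwritten phrase.
votes = [["New", "New", "Now"], ["York", "Fork", "Fork"]]


def phrase_score(prev, cur):
    # Hypothetical context model: only "New York" is a likely phrase.
    return 0.9 if (prev, cur) == ("New", "York") else 0.01


print(majority_vote(votes))
print(context_aware_decode(votes, phrase_score))
```

In this toy input, two of three workers misread the second word, so voting on each word separately keeps the error, while the phrase score lets the decoder recover the answer that fits its neighbor. How the context model is obtained and how worker reliability is estimated are left open here.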

Hailong Sun | Guoliang Li | Jin-Peng Huai | Richong Zhang | Yili Fang
