Aggregating Crowdsourced Quantitative Claims: Additive and Multiplicative Models

Truth discovery is an important technique for enabling reliable crowdsourcing applications. It aims to automatically discover truths from possibly conflicting crowdsourced claims. Most existing truth discovery approaches focus on categorical applications, such as image classification, and use accuracy, i.e., the rate of exactly correct claims, to capture the reliability of participants. As a consequence, they are not effective for truth discovery in quantitative applications, such as percentage annotation and object counting, where similarity rather than exact matching between crowdsourced claims and latent truths should be considered. In this paper, we propose two unsupervised Quantitative Truth Finders (QTFs) for truth discovery in quantitative crowdsourcing applications. One QTF explores an additive model and the other a multiplicative model, capturing the different relationships between crowdsourced claims and latent truths that arise in different classes of quantitative tasks. These QTFs naturally incorporate the similarity between variables. Moreover, they use bias and confidence instead of accuracy to capture participants' abilities in quantity estimation, and are thus capable of accurately discovering quantitative truths in particular domains. Through extensive experiments, we demonstrate that these QTFs outperform other state-of-the-art approaches for truth discovery in quantitative crowdsourcing applications, and that they are also quite efficient.
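To make the additive idea concrete, the following is a minimal illustrative sketch (not the paper's actual algorithm; all names and update rules are assumptions). It posits that participant i's claim on task j is the latent truth plus a per-participant bias and Gaussian noise whose precision is that participant's confidence, and it alternates between estimating truths and re-estimating each participant's bias and confidence:

```python
import numpy as np

# Hypothetical sketch of an additive quantitative truth finder.
# Assumed model: claim[i, j] = truth[j] + bias[i] + noise,
# with noise ~ N(0, 1 / conf[i]); conf[i] is participant i's confidence.
# Parameters are fit by simple alternating (EM-style) updates.

def additive_qtf(claims, n_iters=50):
    """claims: (n_participants, n_tasks) matrix of quantitative claims."""
    n_participants, n_tasks = claims.shape
    bias = np.zeros(n_participants)
    conf = np.ones(n_participants)
    for _ in range(n_iters):
        # Truth update: confidence-weighted mean of debiased claims.
        debiased = claims - bias[:, None]
        truths = np.average(debiased, axis=0, weights=conf)
        # Participant update: bias is the mean residual,
        # confidence is the inverse residual variance.
        resid = claims - truths[None, :]
        bias = resid.mean(axis=1)
        var = ((resid - bias[:, None]) ** 2).mean(axis=1)
        conf = 1.0 / np.maximum(var, 1e-8)
    return truths, bias, conf
```

A multiplicative variant of the same sketch would run these updates on log-transformed claims, so that per-participant distortion acts as a scaling factor rather than an additive offset, consistent with a lognormal noise assumption. Note that the additive model is only identifiable up to a shift shared between truths and biases, so a constraint such as zero-mean bias is implicitly assumed.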
