Debiasing crowdsourced quantitative characteristics in local businesses and services

Information about quantitative characteristics of local businesses and services, such as the number of people waiting in line at a cafe or the number of available fitness machines in a gym, is important for informed decision making, crowd management, and event detection. In this paper, we investigate the potential of using crowds as sensors to report such quantitative characteristics, and we study how to recover the true quantity values from noisy crowdsourced reports. Through experiments, we find that crowd sensors exhibit both bias and variance in quantity sensing, and that task difficulty affects sensing accuracy. Based on these findings, we propose an unsupervised probabilistic model that jointly estimates task difficulty, the ability of crowd sensors, and the true quantity values. Unlike existing truth-finding models, which target categorical truth, our model is specifically designed for quantitative truth. In addition to an efficient batch inference algorithm, we design an even faster online version for streaming data. Experimental results in various scenarios demonstrate the effectiveness of our model.
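
To make the idea of jointly estimating sensor bias, sensor variance, and true quantities concrete, the following is a minimal sketch and not the model proposed in the paper. It assumes each report equals the true value plus a constant per-sensor bias plus zero-mean Gaussian noise with a per-sensor variance, and it alternates between three simple updates. Task difficulty is left out for brevity. The function name `debias_reports`, the parameters `n_iter` and `var_floor`, and the constraint that biases average to zero across sensors (needed here so that biases and truths are identifiable) are all assumptions made for illustration.

```python
from collections import defaultdict


def debias_reports(reports, n_iter=50, var_floor=1e-6):
    """Alternating estimation of truths, sensor biases, and sensor variances.

    reports: dict mapping (sensor, task) -> reported quantity.
    Assumed model (illustrative only): report = truth + bias[sensor] + noise,
    with noise ~ N(0, var[sensor]) and biases averaging to zero over sensors.
    Returns (truth, bias, var) as dicts.
    """
    by_task = defaultdict(list)
    by_sensor = defaultdict(list)
    for (sensor, task), value in reports.items():
        by_task[task].append((sensor, value))
        by_sensor[sensor].append((task, value))

    bias = {s: 0.0 for s in by_sensor}
    var = {s: 1.0 for s in by_sensor}
    truth = {}

    for _ in range(n_iter):
        # Truth step: precision-weighted mean of bias-corrected reports.
        for task, obs in by_task.items():
            weight_sum = sum(1.0 / var[s] for s, _ in obs)
            truth[task] = sum((v - bias[s]) / var[s] for s, v in obs) / weight_sum

        # Bias step: mean residual of each sensor against the current truths.
        for sensor, obs in by_sensor.items():
            bias[sensor] = sum(v - truth[t] for t, v in obs) / len(obs)

        # Identifiability assumption: center the biases at zero.
        mean_bias = sum(bias.values()) / len(bias)
        for sensor in bias:
            bias[sensor] -= mean_bias

        # Variance step: mean squared residual after bias correction,
        # floored so that no sensor receives an infinite weight.
        for sensor, obs in by_sensor.items():
            sq = sum((v - truth[t] - bias[sensor]) ** 2 for t, v in obs)
            var[sensor] = max(sq / len(obs), var_floor)

    return truth, bias, var


# Two hypothetical sensors: "A" over-counts by 2, "B" under-counts by 2.
example = {("A", 0): 12.0, ("B", 0): 8.0, ("A", 1): 22.0, ("B", 1): 18.0}
truth, bias, var = debias_reports(example)
```

Under the zero-mean-bias assumption, a bias shared by all sensors cannot be detected by this sketch. Correcting such a shared bias would require extra information, for example a few tasks with known ground truth.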
