Topic-Aware Social Sensing with Arbitrary Source Dependency Graphs

This work is motivated by the emergence of social sensing as a new paradigm for collecting observations about the physical environment from humans or from devices acting on their behalf. These observations may be true or false, and hence are viewed as binary claims. A fundamental problem in social sensing applications lies in ascertaining the correctness of claims and the reliability of data sources without knowing either of them a priori. We refer to this problem as truth discovery. Prior work has made significant progress toward addressing the truth discovery problem, but two significant limitations remain: (i) it ignored the fact that claims reported in social sensing applications can be either relevant or irrelevant to the topic of interest; (ii) it assumed either that the data sources are independent or that the source dependency graph can be represented as a set of disjoint trees. These limitations led to suboptimal truth discovery results. In contrast, this paper presents the first social sensing framework that explicitly incorporates both the topic relevance of claims and arbitrary source dependency graphs into the solution of the truth discovery problem. The new framework solves a multidimensional maximum likelihood estimation problem to jointly estimate the truthfulness and topic relevance of claims as well as the reliability and topic awareness of sources. We compared our new scheme with state-of-the-art truth discovery solutions using three real-world data traces collected from Twitter in the aftermath of the Paris Shooting event (2015), Hurricane Arthur (2014), and the Boston Bombing event (2013), respectively. The evaluation results showed that our scheme significantly outperforms the compared baselines by identifying more relevant and truthful claims in the truth discovery results.
