Spatial-Temporal Aware Truth Finding in Big Data Social Sensing Applications

This paper presents a spatial-temporal aware analytical framework to solve the truth finding problem in social sensing applications. Social sensing has emerged as a new big data application paradigm in which observations about the physical environment are collected from social sensors (e.g., humans) or from devices acting on their behalf. The collected observations may be true or false, and hence are viewed as binary claims. A fundamental challenge in social sensing applications lies in accurately ascertaining the correctness of claims and the reliability of data sources without knowing either of them a priori. This challenge is referred to as truth finding. Significant efforts have been made to address this challenge, but two important features are largely missing from state-of-the-art solutions: when and where the claims are reported by a source. In this paper, we develop a new spatial-temporal aware truth finding scheme that explicitly incorporates the time information of a claim and the location information of a source into a rigorous analytical framework. The new truth finding scheme solves a constrained optimization problem to determine both the source reliability and the claim correctness. We evaluated the spatial-temporal aware truth finding scheme through both an extensive simulation study and a real-world case study using Twitter data feeds. The evaluation results show that our new scheme outperforms all the compared state-of-the-art baselines and significantly improves the truth finding accuracy in social sensing applications.
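
To make the problem statement concrete, the following is a minimal sketch of a generic iterative truth finding heuristic, written in Python. It is not the spatial-temporal aware scheme of this paper and does not solve the constrained optimization problem mentioned above: it ignores time and location entirely, and only illustrates how claim correctness and source reliability can be estimated jointly when neither is known a priori. The function name `iterative_truth_finding`, the `reports` input format, and the scoring rules are all assumptions made for illustration.

```python
def iterative_truth_finding(reports, num_iters=20):
    """Jointly estimate source reliability and claim belief (illustrative only).

    reports: dict mapping each source to the set of binary claims it asserted.
    Returns (reliability, belief), two dicts with scores in [0, 1].
    """
    sources = list(reports)
    claims = sorted({c for asserted in reports.values() for c in asserted})

    # No prior knowledge: every source starts with the same reliability.
    reliability = {s: 0.5 for s in sources}
    belief = {}

    for _ in range(num_iters):
        # Claim score: total reliability of the sources asserting the claim,
        # normalized by the best-supported claim so scores stay in [0, 1].
        raw = {
            c: sum(reliability[s] for s in sources if c in reports[s])
            for c in claims
        }
        top = max(raw.values(), default=0.0) or 1.0
        belief = {c: raw[c] / top for c in claims}

        # Source score: average belief of the claims the source asserted.
        reliability = {
            s: (sum(belief[c] for c in reports[s]) / len(reports[s]))
            if reports[s] else 0.0
            for s in sources
        }

    return reliability, belief


if __name__ == "__main__":
    # Hypothetical toy data: four sources reporting three claims.
    toy_reports = {
        "a": {"c1", "c2"},
        "b": {"c1", "c2"},
        "c": {"c1", "c3"},
        "d": {"c3"},
    }
    rel, bel = iterative_truth_finding(toy_reports)
    print(rel)
    print(bel)
```

In this toy input, claim `c2` is backed by two sources that also report the best-supported claim, so it ends up with a higher belief than `c3`, and source `d`, which reports only `c3`, ends up less reliable than source `a`. A scheme of the kind described in the abstract would additionally take into account when each claim was reported and where the reporting source was located.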
