On robust truth discovery in sparse social media sensing

In the big data era, it is important to identify trustworthy information from an influx of noisy data contributed by unvetted sources on online social media (e.g., Twitter, Instagram, Flickr). This task, referred to as truth discovery, aims to identify the reliability of the sources and the truthfulness of the claims they make without knowing either of them a priori. Two important challenges have not been well addressed in current truth discovery solutions. The first is “misinformation spread”, where a majority of sources contribute to false claims, making the identification of truthful claims difficult. The second is “data sparsity”, where each source contributes only a small number of claims, providing insufficient evidence to accomplish the truth discovery task. In this paper, we develop a Robust Truth Discovery (RTD) scheme to address these two challenges. In particular, the RTD scheme explicitly quantifies the different degrees of attitude that a source may express toward a claim and incorporates the historical contributions of a source using a principled approach. The evaluation results on two real-world datasets show that the RTD scheme significantly outperforms state-of-the-art truth discovery methods.
