Mining streaming tweets for real-time event credibility prediction in Twitter

Social media platforms such as Twitter have been widely adopted for information dissemination because of their convenience and efficiency. However, false information and rumors undermine their value as a real-time information source. Existing approaches to information credibility analysis rely on offline batch analysis, which often incurs a long lag after an event first occurs. In this paper, we develop a generative probabilistic model for real-time event credibility prediction on Twitter. We propose an online prediction algorithm that operates on streaming tweets without storing or reprocessing past tweets. We evaluate both the offline batch prediction and the online streaming prediction performance of the proposed model on a Twitter dataset. The empirical results show that, in batch mode, the model outperforms other algorithms based on aggregation analysis, and that its online prediction performance approaches the batch performance after only a few hundred tweets.
