Automatically Identifying Fake News in Popular Twitter Threads

Information quality in social media is an increasingly important issue, but web-scale data hinders experts' ability to assess and correct much of the inaccurate content, or "fake news," present on these platforms. This paper develops a method for automating fake news detection on Twitter by learning to predict accuracy assessments in two credibility-focused Twitter datasets: CREDBANK, a crowdsourced dataset of accuracy assessments for events on Twitter, and PHEME, a dataset of potential rumors on Twitter together with journalistic assessments of their accuracy. We apply this method to Twitter content sourced from BuzzFeed's fake news dataset and show that models trained on crowdsourced workers' assessments outperform both models trained on journalists' assessments and models trained on a pooled dataset of crowdsourced and journalistic assessments. All three datasets, aligned into a uniform format, are also publicly available. A feature analysis then identifies the features that are most predictive of crowdsourced and journalistic accuracy assessments, and the results are consistent with prior work. We close with a discussion that contrasts accuracy with credibility and examines why models of non-experts outperform models of journalists for fake news detection on Twitter.
