Content-based features predict social media influence operations

Coordinated political influence operations leave a distinctive signature in the content they produce, one that machine learning can detect. We study how reliably influence operations can be distinguished from organic social media activity by assessing the performance of a platform-agnostic machine learning approach. Our method detects content that is part of coordinated influence operations using only public activity and human-interpretable features derived solely from content. We test it on publicly available Twitter data covering Chinese, Russian, and Venezuelan troll activity targeting the United States, as well as the Reddit dataset of Russian influence efforts. To assess how well content-based features distinguish these influence operations from random samples of both general and politically engaged American users, we train and test classifiers on a monthly basis for each campaign across five prediction tasks. Content-based features perform well across time periods, countries, platforms, and prediction tasks. The industrialized production of influence campaign content leaves a distinctive signal in user-generated content that makes it possible to track campaigns from month to month and across accounts.
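The monthly classification setup described above can be illustrated with a short sketch. The code below is not the authors' exact pipeline: it assumes a hypothetical pandas DataFrame `posts` with columns `text`, `month`, and a binary `is_troll` label, extracts a handful of simple human-interpretable content features (hashtag, mention, URL, and word counts), and trains a separate random-forest classifier for each month using scikit-learn. The feature set and model choice here are illustrative assumptions, not the paper's specification.

```python
# Minimal sketch of monthly content-based troll classification.
# Assumptions: a DataFrame `posts` with columns `text`, `month`,
# `is_troll`; the features and model below are illustrative only.
import re
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import f1_score

def content_features(text: str) -> dict:
    """Simple human-interpretable features derived solely from content."""
    words = text.split()
    return {
        "n_words": len(words),
        "n_hashtags": len(re.findall(r"#\w+", text)),
        "n_mentions": len(re.findall(r"@\w+", text)),
        "n_urls": len(re.findall(r"https?://\S+", text)),
        "mean_word_len": sum(len(w) for w in words) / max(len(words), 1),
    }

def monthly_scores(posts: pd.DataFrame) -> dict:
    """Train and evaluate one classifier per month, mirroring the
    month-by-month prediction setup described in the abstract."""
    scores = {}
    for month, group in posts.groupby("month"):
        X = pd.DataFrame([content_features(t) for t in group["text"]])
        y = group["is_troll"].values
        X_tr, X_te, y_tr, y_te = train_test_split(
            X, y, test_size=0.25, stratify=y, random_state=0
        )
        clf = RandomForestClassifier(n_estimators=200, random_state=0)
        clf.fit(X_tr, y_tr)
        scores[month] = f1_score(y_te, clf.predict(X_te))
    return scores
```

Training one classifier per month, rather than a single pooled model, is what lets the signal be tracked over time: a month where performance drops indicates the campaign's content became harder to separate from organic activity.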
