BotPercent: Estimating Twitter Bot Populations from Groups to Crowds

Twitter bot detection has become increasingly important for combating misinformation, identifying malicious online campaigns, and protecting the integrity of social media discourse. Existing bot detection literature focuses mostly on identifying individual bots; how to estimate the proportion of bots within specific communities and social networks remains underexplored, even though it has significant implications for both content moderators and day-to-day users. In this work, we propose community-level bot detection, a novel approach that gauges the amount of malicious interference in an online community by estimating its percentage of bot accounts. Specifically, we introduce BotPercent, an amalgamation of Twitter bot detection datasets and feature-, text-, and graph-based models that overcomes the generalization issues of existing individual-level models, resulting in more accurate community-level bot estimation. Experiments demonstrate that BotPercent achieves state-of-the-art community-level bot detection performance on the TwiBot-22 benchmark while remaining robust to tampering with specific user features. Armed with BotPercent, we analyze bot rates in different Twitter groups and communities, such as all active Twitter users, users who interact with partisan news media, users who participate in Elon Musk's content moderation votes, and the political communities of different countries and regions. Our experimental results demonstrate that the presence of Twitter bots is not homogeneous, but rather follows a spatial-temporal distribution whose heterogeneity should be taken into account in content moderation, social media policymaking, and beyond. The BotPercent implementation is available at https://github.com/TamSiuhin/BotPercent
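To make the idea of a community-level estimate concrete, the following is a minimal sketch and not the BotPercent implementation. It assumes that several detectors (for example one feature-based, one text-based, and one graph-based) have each already produced a calibrated bot probability for every account in a community. The function names `community_bot_percentage` and `bootstrap_interval`, the simple unweighted averaging across models, and the bootstrap interval are all illustrative assumptions.

```python
import random
from statistics import mean


def community_bot_percentage(model_probs):
    """Estimate the bot percentage of a community.

    model_probs: list of lists; model_probs[m][i] is the (assumed calibrated)
    bot probability that hypothetical model m assigns to account i.
    Returns (percentage, per_account_probabilities).
    """
    n_accounts = len(model_probs[0])
    # Assumption: combine models by an unweighted average per account.
    per_account = [mean(m[i] for m in model_probs) for i in range(n_accounts)]
    # The expected share of bots is the mean of the per-account probabilities.
    return 100.0 * mean(per_account), per_account


def bootstrap_interval(per_account, n_resamples=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the community bot percentage."""
    rng = random.Random(seed)
    n = len(per_account)
    # Resample accounts with replacement and recompute the percentage each time.
    estimates = sorted(
        100.0 * mean(rng.choices(per_account, k=n)) for _ in range(n_resamples)
    )
    lower = estimates[int((alpha / 2) * n_resamples)]
    upper = estimates[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


if __name__ == "__main__":
    # Made-up probabilities for four accounts from two hypothetical models.
    probs = [[0.9, 0.1, 0.2, 0.8], [0.7, 0.3, 0.0, 1.0]]
    pct, per_account = community_bot_percentage(probs)
    print(f"estimated bot percentage: {pct:.1f}%")
    print("95% bootstrap interval:", bootstrap_interval(per_account))
```

Averaging probabilities rather than counting thresholded labels means the estimate depends on how well calibrated the underlying models are, which is why calibration is treated here as a precondition rather than something the sketch provides.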