Birdspotter: A Tool for Analyzing and Labeling Twitter Users

The impact of online social media on societal events and institutions is profound, and with the rapid increase in user uptake, we are only starting to understand its ramifications. Social scientists and practitioners who model online discourse as a proxy for real-world behavior often curate large social media datasets. A lack of tooling aimed at non-data-science experts frequently leaves this data (and the insights it holds) underutilized. Here, we propose birdspotter, a tool to analyze and label Twitter users, and birdspotter.ml, an exploratory visualizer for the computed metrics. birdspotter provides an end-to-end analysis pipeline, from processing pre-collected Twitter data to general-purpose labeling of users and estimating their social influence, within a few lines of code. The package features tutorials and detailed documentation. We also illustrate how to train birdspotter into a fully-fledged bot detector that achieves better than state-of-the-art performance without making Twitter API calls, and we showcase its usage in an exploratory analysis of a topical COVID-19 dataset.
