VASSL: A Visual Analytics Toolkit for Social Spambot Labeling

Social media platforms are filled with social spambots. Detecting these malicious accounts is essential yet challenging, as they continually evolve to evade detection techniques. In this article, we present VASSL, a visual analytics system that assists in detecting and labeling spambots. Our tool improves the performance and scalability of manual labeling by providing multiple connected views and by applying dimensionality reduction, sentiment analysis, and topic modeling, which together yield insights for identifying spambots. The system allows users to select and analyze groups of accounts interactively, which enables the detection of spambots that may go unnoticed when accounts are examined individually. We present a user study that objectively evaluates the performance of VASSL users and captures their subjective opinions about the usefulness and ease of use of the tool.
