Bot2Vec: A general approach of intra-community oriented representation learning for bot detection in different types of social networks

Abstract Recently, with the rapid growth of online social networks (OSNs) such as Facebook, Twitter, and Weibo, the number of machine accounts/social bots that mimic human users has increased. Along with the development of artificial intelligence (AI), social bots are designed to become smarter and more sophisticated in replicating the normal behaviors of human accounts. Constructing reliable and effective bot detection mechanisms is thus considered crucial to keeping OSNs clean and safe for users. Despite the rapid development of social bot detection platforms, recent state-of-the-art systems still face challenges related to model generalization (whether a model can be adapted to multiple types of OSNs) and to the considerable effort required for feature engineering. In this paper, we propose a novel approach, called Bot2Vec, that applies network representation learning (NRL) to bot/spammer detection. The proposed Bot2Vec model is designed to automatically preserve both the local neighborhood relations and the intra-community structure of user nodes while learning the representation of a given OSN, without using any extra features based on the user's profile. By applying an intra-community random walk strategy, Bot2Vec aims to produce better user node embeddings than recent state-of-the-art network embedding baselines for bot detection tasks. Extensive experiments on two different types of real-world social networks (Twitter and Tagged) demonstrate the effectiveness of the proposed model. The source code for implementing the Bot2Vec model is available at: https://github.com/phamtheanhphu/bot2vec
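To make the idea of an intra-community random walk more concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes that a community assignment for each user node is already available (for example, from a community detection step) and that the walk prefers neighbors in the current node's community. The function name `intra_community_walk`, the parameter `p_stay`, and the toy graph are hypothetical and introduced only for illustration; the abstract does not specify the exact transition rule.

```python
import random


def intra_community_walk(adj, community, start, length, p_stay=0.8, rng=None):
    """Generate one random walk that prefers same-community neighbors.

    adj:       dict mapping node -> list of neighbor nodes
    community: dict mapping node -> community id (assumed precomputed)
    p_stay:    hypothetical probability of staying inside the current
               community when both kinds of neighbors exist
    """
    rng = rng or random.Random()
    walk = [start]
    while len(walk) < length:
        cur = walk[-1]
        nbrs = adj.get(cur, [])
        if not nbrs:
            break  # dead end: stop the walk early
        same = [n for n in nbrs if community[n] == community[cur]]
        other = [n for n in nbrs if community[n] != community[cur]]
        # Stay in the community with probability p_stay; fall back to
        # whichever neighbor group is non-empty when the other is empty.
        if same and (not other or rng.random() < p_stay):
            walk.append(rng.choice(same))
        else:
            walk.append(rng.choice(other))
    return walk


# Toy graph (assumed): two triangles joined by the edge 2-3.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3],
       3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
community = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}

walks = [intra_community_walk(adj, community, node, length=8,
                              rng=random.Random(node))
         for node in adj]
for w in walks:
    print(w)
```

In a DeepWalk-style pipeline, walks like these would be treated as sentences and passed to a skip-gram model to obtain node embeddings, which a downstream classifier could then use to separate bot and human accounts. Whether Bot2Vec weights transitions in exactly this way is an assumption of the sketch.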
