On the capability of evolved spambots to evade detection via genetic engineering

Abstract: For decades, genetic algorithms have been used as an effective heuristic for solving optimization problems. However, genetic algorithms may require a string-based genetic encoding of information, a requirement that has severely limited their applicability to online accounts. Remarkably, a behavioral modeling technique inspired by biological DNA has recently been proposed, and successfully applied, for monitoring and detecting spambots in Online Social Networks. In this so-called digital DNA representation, the behavioral lifetime of an account is encoded as a sequence of characters, namely a digital DNA sequence. In previous work, the authors proposed creating synthetic digital DNA sequences that resemble the digital DNA sequences of real accounts. The combination of (i) the capability to model accounts' behaviors as digital DNA sequences, (ii) the possibility to create synthetic digital DNA sequences, and (iii) the evolutionary simulations allowed by genetic algorithms opens up the unprecedented opportunity to study, and even anticipate, the evolutionary patterns of modern social spambots. In this paper, we experiment with a novel ad hoc genetic algorithm that allows us to obtain behaviorally evolved spambots. By varying the parameters of the genetic algorithm, we evaluate the capability of the evolved spambots to escape a state-of-the-art behavior-based detection technique. Notably, although this detection technique has achieved excellent performance in the recent past, a number of our spambot evolutions manage to escape detection. Our analysis, if carried out at large scale, would allow us to proactively identify possible spambot evolutions capable of evading current detection techniques.
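To make the encoding idea concrete, the following is a minimal Python sketch of how an account's actions could be mapped to a digital DNA string and then recombined by generic genetic operators. It is illustrative only and is not the paper's algorithm: the three-letter alphabet, the action names, and the functions `encode`, `crossover`, and `mutate` are all assumptions introduced here, and the operators are textbook one-point crossover and point mutation.

```python
import random

# Hypothetical alphabet: one character per action type (an assumption,
# not taken from the abstract).
ACTION_TO_BASE = {"tweet": "A", "reply": "C", "retweet": "T"}
ALPHABET = "".join(sorted(set(ACTION_TO_BASE.values())))


def encode(actions):
    """Encode a chronological list of actions as a digital DNA string."""
    return "".join(ACTION_TO_BASE[action] for action in actions)


def crossover(parent_a, parent_b, rng):
    """One-point crossover: a prefix of parent_a joined to a suffix of parent_b."""
    limit = min(len(parent_a), len(parent_b))
    if limit < 2:
        # Too short to cut; return the first parent unchanged.
        return parent_a
    cut = rng.randrange(1, limit)
    return parent_a[:cut] + parent_b[cut:]


def mutate(sequence, rate, rng):
    """Replace each character by a random base with probability `rate`."""
    return "".join(
        rng.choice(ALPHABET) if rng.random() < rate else base
        for base in sequence
    )


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    bot_a = encode(["tweet", "tweet", "retweet", "reply", "tweet", "retweet"])
    bot_b = encode(["retweet", "retweet", "retweet", "tweet", "reply", "reply"])
    child = mutate(crossover(bot_a, bot_b, rng), rate=0.1, rng=rng)
    print(bot_a, bot_b, child)
```

In a full evolutionary simulation, each generation of such sequences would also be scored by a fitness function, for example how well a sequence evades a given detector. That step is omitted here because the abstract does not specify it.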
