A new approach to bot detection: Striking the balance between precision and recall

Bots have made their presence felt across many aspects of social media. Twitter in particular has felt the impact, with bots accounting for a large portion of its users. These bots have been used for malicious tasks such as spreading false information about political candidates and inflating the perceived popularity of celebrities, and they can skew the results of common analyses performed on social media data. It is therefore important that researchers and practitioners have tools to remove them. Existing approaches to bot removal are evaluated primarily on precision, at the cost of recall: while they are almost always correct about the accounts they delete, they delete very few, so many bots remain. We propose a model that increases recall in detecting bots, allowing a researcher to remove more of them. We evaluate our model on two real-world social media datasets and show that our detection algorithm removes more bots from a dataset than current approaches.
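To make the precision/recall trade-off concrete, the short sketch below (not from the paper; the counts are hypothetical) computes both metrics for two illustrative detectors: one that flags very few accounts but is almost always right, and one that flags many more accounts at the cost of some false positives.

```python
# Hypothetical illustration of the precision/recall trade-off in bot detection.
# The counts below are made up for demonstration; they do not come from the paper.

def precision_recall(true_positives: int, false_positives: int, false_negatives: int):
    """Return (precision, recall) from confusion-matrix counts."""
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    return precision, recall

# Suppose a dataset contains 1,000 bots.
# A precision-focused detector flags 105 accounts, 100 of which are bots.
p1, r1 = precision_recall(true_positives=100, false_positives=5, false_negatives=900)

# A recall-focused detector flags 850 accounts, 700 of which are bots.
p2, r2 = precision_recall(true_positives=700, false_positives=150, false_negatives=300)

print(f"precision-focused: precision={p1:.2f}, recall={r1:.2f}")  # ~0.95, 0.10
print(f"recall-focused:    precision={p2:.2f}, recall={r2:.2f}")  # ~0.82, 0.70
```

Under these assumed counts, the first detector removes almost nothing it shouldn't but leaves 90% of the bots in place, while the second removes far more bots overall; the latter behavior is what the proposed model aims for.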
