A hybrid approach for spam detection for Twitter

Online social networks (OSNs) have become extremely popular among Internet users, who spend a significant amount of time on social networking sites such as Facebook, Twitter and Google+. These sites are becoming pervasive and now serve as a communication channel for billions of users. Members of the online community use them to find new friends and to update their existing friends with their latest thoughts and activities. The huge volume of information available on these sites attracts cyber criminals, who exploit vulnerabilities for illicit gain, for example by advertising products, luring victims into clicking on malicious links, or infecting users' systems, all for the purpose of making money. Spam detection is therefore one of the major problems facing social networking sites such as Twitter. Most previous techniques use different sets of features to classify spam and non-spam users. In this paper, we propose a hybrid technique that uses content-based as well as graph-based features to identify spammers on the Twitter platform. We have evaluated the proposed technique on a real Twitter dataset with approximately 11k users and more than 400k tweets. Our results show that the detection rate of the proposed technique is much higher than that of the existing techniques.
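To make the idea of a hybrid feature set concrete, the following minimal sketch shows one way content-based and graph-based features could be combined into a single vector per account. It is an illustration only: the specific features (URLs, hashtags and mentions per tweet; follower count, following count and reciprocity), the function names, and the edge-set representation of the follower graph are all assumptions made for this example and are not taken from the paper.

```python
import re

# Hypothetical patterns for simple content-based signals.
URL_RE = re.compile(r"https?://\S+")
HASHTAG_RE = re.compile(r"#\w+")
MENTION_RE = re.compile(r"@\w+")


def content_features(tweets):
    """Average number of URLs, hashtags and mentions per tweet."""
    n = max(len(tweets), 1)  # avoid division by zero for empty accounts
    return [
        sum(len(URL_RE.findall(t)) for t in tweets) / n,
        sum(len(HASHTAG_RE.findall(t)) for t in tweets) / n,
        sum(len(MENTION_RE.findall(t)) for t in tweets) / n,
    ]


def graph_features(user, follows):
    """Follower count, following count and reciprocity for one account.

    `follows` is a set of (follower, followee) edges (an assumed format).
    """
    followers = {a for a, b in follows if b == user}
    following = {b for a, b in follows if a == user}
    reciprocal = followers & following
    return [
        len(followers),
        len(following),
        len(reciprocal) / len(following) if following else 0.0,
    ]


def hybrid_features(user, tweets, follows):
    """Concatenate content-based and graph-based features."""
    return content_features(tweets) + graph_features(user, follows)


if __name__ == "__main__":
    follows = {("spam", "a"), ("spam", "b"), ("spam", "c"),
               ("a", "b"), ("b", "a")}
    tweets = ["Win now http://x.example/1 #free @a",
              "http://x.example/2 #free #win"]
    print(hybrid_features("spam", tweets, follows))
```

A vector of this kind could then be passed to any standard classifier; which classifier and which features work best is an empirical question that this sketch does not address.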
