Paper Information - Real-Time Twitter Content Polluter Detection Based on Direct Features

Real-Time Twitter Content Polluter Detection Based on Direct Features

The large number of content polluters on social networks makes it difficult for users to find valuable content. Some research has addressed spam and phishing detection on social networks, but spam and phishing account for only a small part of all content polluters. What bothers users most is the large volume of repeated, low-quality advertisements. Hence it is necessary to filter out these content polluters to improve the user experience. Moreover, most phishing/spam detection work is done offline, and some of the features used take too long to extract, which makes real-time detection impossible. We perform a study on an extensive Twitter dataset and present a definition of content polluters. We further propose some novel features and, together with other features commonly used in phishing/spam detection, classify them into two categories: direct features and indirect features. A simple random forest classifier is applied using our proposed direct features alone for real-time content polluter detection, and it achieves reasonably high accuracy with high F1 values.
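As a rough illustration of the ideas in the abstract, the sketch below extracts a few features directly from tweet text and computes the F1 score used to evaluate a classifier. The abstract does not list the paper's actual features, so the feature set here (`length`, `num_urls`, `num_hashtags`, `num_mentions`) and the function names `direct_features` and `f1_score` are assumptions chosen for illustration, on the reading that a "direct" feature is one available from the tweet itself without extra lookups. The random forest itself is omitted.

```python
import re

# Hypothetical pattern for spotting links in tweet text.
URL_RE = re.compile(r"https?://\S+")


def direct_features(text):
    """Return features computable from the tweet text alone.

    This feature set is an illustrative assumption, not the paper's list.
    """
    words = text.split()
    return {
        "length": len(text),
        "num_urls": len(URL_RE.findall(text)),
        "num_hashtags": sum(w.startswith("#") for w in words),
        "num_mentions": sum(w.startswith("@") for w in words),
    }


def f1_score(y_true, y_pred):
    """F1 = harmonic mean of precision and recall for the positive class (1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

Because each feature is read off the text in a single pass, extraction cost stays small per tweet, which is the property the abstract relies on for real-time use.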

Chiew Tong Lau | Chai Kiat Yeo | Bu Sung Lee | Weiling Chen

[1] Minaxi Gupta, et al. Twitter games: how successful spammers pick targets, 2012, ACSAC '12.

[2] Vern Paxson, et al. @spam: the underground on 140 characters or less, 2010, CCS '10.

[3] Dawn Xiaodong Song, et al. Suspended accounts in retrospect: an analysis of Twitter spam, 2011, IMC '11.

[4] Alex Pentland, et al. Twitter: who gets caught? observed trends in social micro-blogging spam, 2014, WebSci '14.

[5] Sushil Jajodia, et al. Detecting Automation of Twitter Accounts: Are You a Human, Bot, or Cyborg?, 2012, IEEE Transactions on Dependable and Secure Computing.

[6] Kyumin Lee, et al. Uncovering social spammers: social honeypots + machine learning, 2010, SIGIR.

[7] Chao Yang, et al. Empirical Evaluation and New Design for Fighting Evolving Twitter Spammers, 2013, IEEE Trans. Inf. Forensics Secur.

[8] Po-Ching Lin, et al. A study of effective features for detecting long-surviving Twitter spam accounts, 2013, 2013 15th International Conference on Advanced Communications Technology (ICACT).

[9] Wei Hu, et al. Twitter spammer detection using data stream clustering, 2014, Inf. Sci.

[10] Chao Yang, et al. Empirical Evaluation and New Design for Fighting Evolving Twitter Spammers, 2011, IEEE Transactions on Information Forensics and Security.