Detecting Internet Hidden Paid Posters Based on Group and Individual Characteristics

Online social networks are popular communication tools for billions of users. Unfortunately, they are also effective tools for hidden paid posters (called the Internet water army in some of the literature) to spread spam or false messages. Paid posters are typically organized in groups that post with specific purposes, and they have flooded the communities of microblogging websites. Traditional detection methods rely only on individual characteristics. In this paper, we study the group characteristics of paid posters and find that, alongside individual characteristics, they are also very important for detection. We construct a classifier based on both individual and group characteristics to detect paid posters. Extensive experiments show that our method outperforms existing methods.
