Spammer detection based on Hidden Markov Model in micro-blogging

Over the past decade, social networks have become the main way for people to obtain information and communicate with their online friends. With the tremendous growth of social networks, spammers have also proliferated, and the spam they post seriously affects people's daily lives. Although platform operators enforce strict prevention strategies, spammers evolve dynamically to adapt to them: they disguise themselves as ordinary users by posting interesting tweets and by posting spam at irregular times. To deal with this problem, we present a method for spammer detection in micro-blogging based on a Hidden Markov Model (HMM). The proposed method classifies users as spammers or non-spammers according to their major count-based and behavior-based features. Using an HMM allows us to describe user behavior dynamically through hidden states, which can be decoded from several observed features. Experimental results on real data collected from the Sina micro-blogging site demonstrate the effectiveness of our method.
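
As an illustration of how hidden states can be decoded from observed features, the following is a minimal sketch of Viterbi decoding for a discrete HMM. The abstract does not specify the decoding procedure, the number of states, or the feature encoding, so everything here is an assumption made for illustration: the two hidden states (0 = normal behavior, 1 = spamming behavior), the three observation symbols, and all probability values are hypothetical and are not taken from the paper.

```python
import math


def viterbi(obs, start_p, trans_p, emit_p):
    """Return the most likely hidden-state sequence for obs (log-space Viterbi)."""
    n_states = len(start_p)
    # Log-probability of the best path ending in each state at time 0.
    score = [math.log(start_p[s]) + math.log(emit_p[s][obs[0]])
             for s in range(n_states)]
    back = []  # back[t-1][s] = best predecessor of state s at time t
    for o in obs[1:]:
        new_score, pointers = [], []
        for s in range(n_states):
            # Choose the predecessor that maximizes the path score into s.
            best_prev = max(range(n_states),
                            key=lambda p: score[p] + math.log(trans_p[p][s]))
            pointers.append(best_prev)
            new_score.append(score[best_prev]
                             + math.log(trans_p[best_prev][s])
                             + math.log(emit_p[s][o]))
        score = new_score
        back.append(pointers)
    # Trace back from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


# Hypothetical parameters (not from the paper).
# Hidden states: 0 = normal behavior, 1 = spamming behavior.
# Observations: 0 = plain post, 1 = post with URL, 2 = burst of posts.
START_P = [0.8, 0.2]
TRANS_P = [[0.9, 0.1],
           [0.2, 0.8]]
EMIT_P = [[0.7, 0.2, 0.1],
          [0.1, 0.4, 0.5]]

decoded = viterbi([0, 0, 2, 2, 2], START_P, TRANS_P, EMIT_P)
```

In a sketch like this, a user's decoded state sequence could then feed a classifier, for example by thresholding the fraction of time steps spent in the spamming state; how the paper actually maps decoded states to a spammer label is not stated in the abstract.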
