Forum-Oriented Research on Water Army Detection for Bursty Topics

The water army is a group of online users who are paid to post comments, new threads, or articles on online communities and websites for undisclosed purposes. Because the posting behavior of the water army is not yet fully understood, detecting what drives a bursty topic on a web forum remains a difficult problem. An analysis of how bursty topics evolve and how the water army posts shows that topics driven by the water army exhibit characteristics in their latency stage that differ from those of general topics. Based on this finding, the paper proposes a novel bursty topic classification algorithm based on SVM active learning, which recasts water army detection as an SVM-based classification decision problem. Experimental results show that the proposed algorithm improves both detection accuracy and detection efficiency.
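The idea of SVM active learning can be illustrated with a minimal sketch. It is not the paper's algorithm: the abstract does not specify the latency-stage features, the kernel, or the query strategy. The sketch assumes two hypothetical standardized features per topic, synthetic data, a linear SVM fitted by sub-gradient descent on the hinge loss, and margin-based uncertainty sampling (request a label for the topic closest to the current decision boundary). All function names are illustrative.

```python
import random

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def train_linear_svm(X, y, lam=0.01, eta=0.05, epochs=100, seed=0):
    """Linear soft-margin SVM fitted by sub-gradient descent on the hinge loss."""
    rng = random.Random(seed)
    w, b = [0.0] * len(X[0]), 0.0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            # Check the margin before applying the regularization shrink.
            violated = y[i] * (dot(w, X[i]) + b) < 1
            w = [(1 - eta * lam) * wj for wj in w]
            if violated:
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
                b += eta * y[i]
    return w, b

def decision(model, x):
    w, b = model
    return dot(w, x) + b

def predict(model, x):
    # +1 = topic driven by the water army, -1 = general topic
    return 1 if decision(model, x) >= 0 else -1

def make_synthetic_topics(n_per_class=100, seed=1):
    """Synthetic stand-in for latency-stage features (assumption, not real data)."""
    rng = random.Random(seed)
    X, y = [], []
    for label, centre in ((1, 1.0), (-1, -1.0)):
        for _ in range(n_per_class):
            X.append([rng.gauss(centre, 0.4), rng.gauss(centre, 0.4)])
            y.append(label)
    return X, y

def active_learn(X, oracle, seed_idx, rounds=20):
    """Each round: retrain, then ask the oracle to label the least certain topic."""
    labeled = {i: oracle(i) for i in seed_idx}
    for _ in range(rounds):
        idx = sorted(labeled)
        model = train_linear_svm([X[i] for i in idx], [labeled[i] for i in idx])
        candidates = [i for i in range(len(X)) if i not in labeled]
        query = min(candidates, key=lambda i: abs(decision(model, X[i])))
        labeled[query] = oracle(query)
    idx = sorted(labeled)
    model = train_linear_svm([X[i] for i in idx], [labeled[i] for i in idx])
    return model, labeled

X, y = make_synthetic_topics()
# The oracle stands in for a human annotator; here it reads the synthetic label.
model, labeled = active_learn(X, oracle=lambda i: y[i], seed_idx=[0, 1, 100, 101])
accuracy = sum(predict(model, x) == t for x, t in zip(X, y)) / len(X)
print(f"labels requested: {len(labeled)}, pool accuracy: {accuracy:.2f}")
```

In this sketch only 24 of the 200 synthetic topics are ever labeled, which is the motivation for active learning: annotation effort is spent on the topics the classifier is least sure about.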
