Behavior and Social Computing

In this paper, we systematically explore an itemset-based extension approach for generating candidate sequences, which yields a better and more straightforward search space traversal than the traditional item-based extension approach. Based on this candidate generation approach, we present FINDER, a novel algorithm for discovering the set of all frequent sequences. FINDER consists of two separate steps. In the first step, all frequent itemsets are discovered, which lets us benefit from existing efficient itemset mining algorithms. In the second step, all frequent sequences with at least two frequent itemsets are detected by combining depth-first search with itemset-based extension candidate generation. A vertical bitmap data representation is adopted for rapid support counting. Several pruning strategies are used to reduce the search space and minimize the cost of computation. An extensive set of experiments demonstrates the effectiveness and the linear scalability of the proposed algorithm.
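To make the second step concrete, the following is a minimal sketch of depth-first search with itemset-based extension over a vertical bitmap representation. It is an illustration under stated assumptions, not the FINDER implementation: the function names (`itemset_bitmaps`, `extend`, `mine`), the one-integer-per-sequence bitmap layout, and the single pruning rule (stop extending an infrequent candidate) are all hypothetical choices made here. The frequent itemsets from the first step are assumed to be supplied by the caller, for example by any existing itemset miner.

```python
from typing import Dict, FrozenSet, Iterable, List, Tuple

Itemset = FrozenSet[str]
Sequence = List[Itemset]          # one customer sequence = ordered transactions
Pattern = Tuple[Itemset, ...]     # a sequential pattern = ordered itemsets


def itemset_bitmaps(db: List[Sequence], itemset: Itemset) -> List[int]:
    """Vertical bitmap: one int per sequence; bit j is set when
    transaction j of that sequence contains `itemset`."""
    bitmaps = []
    for seq in db:
        bits = 0
        for j, transaction in enumerate(seq):
            if itemset <= transaction:
                bits |= 1 << j
        bitmaps.append(bits)
    return bitmaps


def support(bitmaps: List[int]) -> int:
    """Number of sequences in which the pattern occurs at least once."""
    return sum(1 for b in bitmaps if b)


def extend(prefix: List[int], itemset_bm: List[int]) -> List[int]:
    """Itemset-based extension: append a whole itemset after the prefix.
    Keep positions of the itemset strictly after the earliest end of the prefix."""
    out = []
    for p, x in zip(prefix, itemset_bm):
        if p == 0:
            out.append(0)
            continue
        first = (p & -p).bit_length() - 1      # index of the lowest set bit
        after = ~((1 << (first + 1)) - 1)      # mask of all bits above `first`
        out.append(after & x)
    return out


def mine(db: List[Sequence], frequent_itemsets: Iterable[Itemset],
         min_support: int) -> Dict[Pattern, int]:
    """Return frequent sequences with at least two itemsets and their supports."""
    base = {x: itemset_bitmaps(db, x) for x in frequent_itemsets}
    results: Dict[Pattern, int] = {}

    def dfs(pattern: Pattern, bitmaps: List[int]) -> None:
        for x, bm in base.items():
            candidate = extend(bitmaps, bm)
            s = support(candidate)
            if s >= min_support:               # prune infrequent branches
                longer = pattern + (x,)
                results[longer] = s
                dfs(longer, candidate)

    for x, bm in base.items():
        dfs((x,), bm)
    return results
```

Because each extension appends a complete frequent itemset, the search never has to grow an itemset one item at a time inside a sequence; support counting reduces to a mask and a bitwise AND per customer sequence.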
