LSSL-SSD: Social Spammer Detection with Laplacian Score and Semi-supervised Learning

The rapid development of social networks makes it easy for people to communicate online. However, because of their openness, social networks often suffer from social spammers. Spammers spread information for economic gain and threaten the security of social networks. To keep online social networks running over the long term, many detection methods have been proposed. However, current methods typically feed high-dimensional features to supervised learning algorithms, which results in low detection performance. To address this problem, we first apply the Laplacian score method, an unsupervised feature selection method, to obtain useful features. Semi-supervised ensemble learning is then used to train the detection model on the selected features. Experimental results on the Twitter dataset show the efficiency of our approach after feature selection. Moreover, the proposed method maintains high detection performance even when labeled data is limited.
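To make the feature-selection step concrete, the following is a minimal sketch of a Laplacian score computation in the spirit of the standard formulation (a k-nearest-neighbor graph with heat-kernel weights, where a lower score means the feature better preserves local structure). It is not the authors' implementation: the function names `laplacian_scores` and `select_features`, and the defaults for `n_neighbors` and `t`, are assumptions chosen for illustration.

```python
import numpy as np


def laplacian_scores(X, n_neighbors=5, t=1.0):
    """Return one Laplacian score per column of X (lower = better).

    Hypothetical helper for illustration; n_neighbors and t are assumed
    defaults. A constant feature yields a 0/0 division (nan).
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]

    # Pairwise squared Euclidean distances between samples.
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (X @ X.T), 0.0)
    np.fill_diagonal(d2, np.inf)  # a sample is not its own neighbor

    # k-nearest-neighbor graph with heat-kernel weights exp(-d^2 / t).
    nbrs = np.argsort(d2, axis=1)[:, :n_neighbors]
    rows = np.repeat(np.arange(n), n_neighbors)
    cols = nbrs.ravel()
    S = np.zeros((n, n))
    S[rows, cols] = np.exp(-d2[rows, cols] / t)
    S = np.maximum(S, S.T)  # link i and j if either is a neighbor of the other

    d = S.sum(axis=1)    # degree of each sample
    L = np.diag(d) - S   # graph Laplacian

    # Remove the degree-weighted mean of every feature.
    Xc = X - (d @ X) / d.sum()

    num = np.sum(Xc * (L @ Xc), axis=0)           # f^T L f per feature
    den = np.sum(Xc * (d[:, None] * Xc), axis=0)  # f^T D f per feature
    return num / den


def select_features(X, n_keep, **kwargs):
    """Indices of the n_keep features with the smallest Laplacian score."""
    return np.argsort(laplacian_scores(X, **kwargs))[:n_keep]
```

Because the score needs no labels, it can be computed on labeled and unlabeled accounts together; only the columns returned by `select_features` would then be passed on to whichever semi-supervised learner is used.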
