A Hybrid Feature Selection Method Based on Harmony Search

Spammers who post spam on social networks seriously degrade other users' experience on social networking platforms. Microblogging platforms have a huge number of registered accounts, and the sheer volume of user behavior makes spammer detection difficult, so feature selection has become a primary problem in detecting spammers. This paper puts forward a hybrid feature selection method based on harmony search (HS) and ReliefF, called ReF-HS. HS is a heuristic search algorithm, and ReliefF is a filter-based feature selection algorithm; by combining the two, the method considers not only the measure of each single feature but also the correlation between features. Compared with traditional methods, this method converges quickly and is better at avoiding local optima. Experimental results show that the feature subset selected by ReF-HS is small and leads to a higher accuracy rate in spam detection.
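
The abstract does not spell out how ReF-HS couples the two components, so the following is only a minimal sketch of one plausible arrangement, not the authors' algorithm: a binary harmony search over feature masks in which per-feature filter weights bias the initial memory and the random-selection step. The function name `harmony_search_select`, the parameter values, the hard-coded `weights` (a stand-in for normalized ReliefF scores), and the toy `fitness` (a stand-in for classifier accuracy with a size penalty) are all assumptions made for illustration.

```python
import random

def harmony_search_select(weights, fitness, hms=10, hmcr=0.9, par=0.3,
                          iters=200, seed=0):
    """Binary harmony search over feature masks (illustrative sketch).

    weights: per-feature relevance scores in [0, 1]; assumed stand-in
             for normalized ReliefF weights.
    fitness: callable mapping a tuple of 0/1 flags to a score to maximize.
    """
    rng = random.Random(seed)
    n = len(weights)
    # Harmony memory: include each feature with probability equal to its weight.
    memory = [tuple(int(rng.random() < w) for w in weights) for _ in range(hms)]
    scores = [fitness(h) for h in memory]
    for _ in range(iters):
        new = []
        for j in range(n):
            if rng.random() < hmcr:
                # Memory consideration: copy bit j from a random stored harmony.
                bit = memory[rng.randrange(hms)][j]
                if rng.random() < par:
                    bit = 1 - bit  # pitch adjustment: flip the bit
            else:
                # Random selection, guided by the filter weight.
                bit = int(rng.random() < weights[j])
            new.append(bit)
        new = tuple(new)
        s = fitness(new)
        # Replace the worst stored harmony if the new one is better.
        worst = min(range(hms), key=scores.__getitem__)
        if s > scores[worst]:
            memory[worst], scores[worst] = new, s
    best = max(range(hms), key=scores.__getitem__)
    return memory[best], scores[best]

# Hypothetical filter weights for six features.
weights = [0.9, 0.8, 0.1, 0.05, 0.7, 0.2]

def fitness(mask):
    # Toy objective: reward relevant features, penalize subset size.
    return sum(w * b for w, b in zip(weights, mask)) - 0.3 * sum(mask)

mask, score = harmony_search_select(weights, fitness)
print(mask, score)
```

In a real setting the toy `fitness` would be replaced by a wrapper evaluation, such as the cross-validated accuracy of a spam classifier trained on the selected features, which is where interactions between features enter the search.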
