Spam Filtering in Social Networks Using Regularized Deep Neural Networks with Ensemble Learning

Spam filtering in social networks is increasingly important owing to the rapid growth of the social network user base, and sophisticated spam filters are needed to deal with this complex problem. Traditional machine learning approaches, such as neural networks, support vector machines and Naive Bayes classifiers, are not effective enough at processing and exploiting the complex features present in high-dimensional social network spam data. To overcome this problem, we propose a novel approach to social network spam filtering that uses ensemble learning techniques with regularized deep neural networks as base learners. We demonstrate on a benchmark dataset that this approach is effective for social network spam filtering in terms of accuracy and area under the ROC curve. It also performs well in terms of false negative and false positive rates. Finally, we show that the proposed approach outperforms other popular algorithms used in spam filtering, including decision trees, Naive Bayes, artificial immune systems and support vector machines.
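The idea of an ensemble whose base learners are regularized neural networks can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the ensemble scheme (bagging), the regularizers (dropout and L2 weight decay), the single hidden layer standing in for a deeper network, the synthetic two-class data, and all function names and hyperparameters.

```python
import numpy as np

def make_toy_data(n, rng):
    """Synthetic stand-in for spam features (assumption): two Gaussian classes in 10-D."""
    y = rng.integers(0, 2, size=n)
    X = rng.normal(size=(n, 10)) + 1.5 * y[:, None]  # "spam" class shifted by +1.5
    return X, y.astype(float)

def train_net(X, y, rng, hidden=16, p_drop=0.5, l2=1e-3, lr=0.1, epochs=300):
    """One-hidden-layer ReLU network trained with inverted dropout and L2 weight decay."""
    n, d = X.shape
    W1 = rng.normal(scale=np.sqrt(2.0 / d), size=(d, hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(scale=np.sqrt(2.0 / hidden), size=hidden)
    b2 = 0.0
    for _ in range(epochs):
        z1 = X @ W1 + b1
        h = np.maximum(z1, 0.0)
        # Inverted dropout: zero hidden units with probability p_drop, rescale the rest.
        mask = (rng.random(h.shape) >= p_drop) / (1.0 - p_drop)
        hd = h * mask
        p = 1.0 / (1.0 + np.exp(-(hd @ W2 + b2)))
        g_out = (p - y) / n  # gradient of mean cross-entropy w.r.t. the logits
        gW2 = hd.T @ g_out + l2 * W2
        gb2 = g_out.sum()
        g_h = np.outer(g_out, W2) * mask * (z1 > 0)
        gW1 = X.T @ g_h + l2 * W1
        gb1 = g_h.sum(axis=0)
        W1 -= lr * gW1
        b1 -= lr * gb1
        W2 -= lr * gW2
        b2 -= lr * gb2
    return W1, b1, W2, b2

def predict_proba(params, X):
    """Spam probability from one network; dropout is disabled at inference."""
    W1, b1, W2, b2 = params
    h = np.maximum(X @ W1 + b1, 0.0)
    return 1.0 / (1.0 + np.exp(-(h @ W2 + b2)))

def bagging_ensemble(X, y, rng, n_members=5):
    """Train each base network on a bootstrap resample of the training set."""
    members = []
    for _ in range(n_members):
        idx = rng.integers(0, len(X), size=len(X))
        members.append(train_net(X[idx], y[idx], rng))
    return members

def ensemble_predict(members, X):
    """Average the members' probabilities and threshold at 0.5."""
    probs = np.mean([predict_proba(m, X) for m in members], axis=0)
    return (probs >= 0.5).astype(float)

rng = np.random.default_rng(0)
X_train, y_train = make_toy_data(400, rng)
X_test, y_test = make_toy_data(200, rng)
members = bagging_ensemble(X_train, y_train, rng)
accuracy = float(np.mean(ensemble_predict(members, X_test) == y_test))
print(f"ensemble test accuracy: {accuracy:.3f}")
```

Averaging probabilities over bootstrap-trained members is only one way to combine base learners; boosting or random-subspace sampling would change `bagging_ensemble` but leave the regularized base network untouched.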
