An Intelligent Spam Detection Model Based on Artificial Immune System

Spam emails, referred to as non-self in artificial immune system terminology, are unsolicited commercial or malicious emails sent to affect an individual, a corporation, or a group of people. Besides advertising, they may contain links to phishing or malware-hosting websites set up to steal confidential information. This paper studies the effectiveness of a Negative Selection Algorithm (NSA) for anomaly detection applied to spam filtering. NSA offers high performance and a low false detection rate. The designed framework works through three detection phases to determine an email’s legitimacy based on the knowledge gathered in the training phase. The system operates by elimination through negative selection, similar to the function of T-cells in biological systems. It was observed that performance continues to improve as more datasets are included, resulting in a 6% increase in the True Positive and True Negative detection rates while achieving an actual detection rate of spam and ham of 98.5%. The model was further compared against similar studies, and the results show that the proposed system yields an increase of 2 to 15% in the correct detection rate of spam and ham.
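The elimination principle behind negative selection can be illustrated with a minimal sketch. This is a generic real-valued NSA, not the three-phase framework of the paper, whose details are not given in the abstract. It assumes that emails have already been converted to numeric feature vectors in the unit hypercube; the function names, the Euclidean matching rule, and the `radius` parameter are all hypothetical choices made for illustration.

```python
import math
import random


def distance(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def train_detectors(self_samples, n_detectors, radius, dim, seed=0,
                    max_tries=100_000):
    """Generate random candidate detectors and keep only those that do
    NOT match any self (ham) sample. Matching is assumed to mean
    'within `radius` in Euclidean distance'."""
    rng = random.Random(seed)
    detectors = []
    for _ in range(max_tries):
        if len(detectors) == n_detectors:
            break
        candidate = [rng.random() for _ in range(dim)]
        # Negative selection: eliminate candidates that react to self.
        if all(distance(candidate, s) > radius for s in self_samples):
            detectors.append(candidate)
    return detectors


def is_spam(sample, detectors, radius):
    """Flag a sample as non-self (spam) if any surviving detector matches it."""
    return any(distance(sample, d) <= radius for d in detectors)


# Toy data (assumed): ham vectors clustered near the origin of a 2-D feature space.
ham = [[0.10, 0.10], [0.12, 0.08], [0.08, 0.15], [0.15, 0.12]]
RADIUS = 0.15
detectors = train_detectors(ham, n_detectors=50, radius=RADIUS, dim=2)

# No training ham sample can be flagged, because every detector that
# matched one was discarded during training.
ham_flags = [is_spam(s, detectors, RADIUS) for s in ham]
```

Because detectors are generated at random, coverage of the non-self region depends on the number of detectors and the matching radius; a point far from all ham samples is flagged only if some detector happens to lie within `radius` of it.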
