Genetic optimized artificial immune system in spam detection: a review and a model

Spam is a serious, universal problem that affects almost all computer users. It burdens not only ordinary internet users but also companies and organizations, which lose large sums of money through reduced productivity, wasted user time, and consumed network bandwidth. Many studies indicate that spam costs organizations billions of dollars every year. This work presents a machine learning method inspired by the human immune system, the Artificial Immune System (AIS), an emerging method that still needs further exploration. Core modifications were applied to the standard AIS with the aid of a Genetic Algorithm. An Artificial Neural Network is also applied to spam detection in a new manner. The SpamAssassin corpus is used in all our simulations.
