E-Mail Spam Filtering: A Review of Techniques and Trends

We present a comprehensive review of recent and successful content-based e-mail spam filtering techniques. Our focus is mainly on machine learning-based spam filters and the variants inspired by them. We report on relevant ideas, techniques, taxonomy, major efforts, and the state of the art in the field. Our initial survey of prior work examines the basics of e-mail spam filtering and feature engineering. We conclude by examining techniques and evaluation benchmarks, exploring promising offshoots of the latest developments, and suggesting lines of future investigation.
