On Spam Filtering Classification: A Majority Voting-like Approach

Despite improvements in filtering tools and information security, spam still causes substantial damage to public and private organizations. In this paper, we present a majority-voting-based approach to identifying spam messages. A new methodology for building a majority voting classifier is presented and tested. Results on the SpamAssassin dataset indicate a non-negligible improvement over the state of the art, which paves the way for further development and applications.
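For readers unfamiliar with the general idea, the following is a minimal sketch of plain (hard) majority voting over several base classifiers. It is not the methodology proposed in the paper, which the abstract does not detail. The three keyword rules, the `majority_vote` function, and the `tie_label` parameter are hypothetical names introduced only for illustration.

```python
from collections import Counter
from typing import Callable, Sequence

# A base classifier maps a message to a label, "spam" or "ham".
Classifier = Callable[[str], str]


def majority_vote(classifiers: Sequence[Classifier], message: str,
                  tie_label: str = "ham") -> str:
    """Return the label predicted by most classifiers.

    Ties fall back to `tie_label` (an assumption of this sketch: when in
    doubt, keep the message rather than discard it).
    """
    if not classifiers:
        raise ValueError("at least one classifier is required")
    votes = Counter(clf(message) for clf in classifiers)
    ranked = votes.most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return tie_label
    return ranked[0][0]


# Hypothetical toy base classifiers; real systems would use trained models.
def money_words(msg: str) -> str:
    words = ("free", "winner", "prize")
    return "spam" if any(w in msg.lower() for w in words) else "ham"


def many_exclamations(msg: str) -> str:
    return "spam" if msg.count("!") >= 3 else "ham"


def has_plain_link(msg: str) -> str:
    return "spam" if "http://" in msg.lower() else "ham"


TOY_CLASSIFIERS = [money_words, many_exclamations, has_plain_link]

if __name__ == "__main__":
    print(majority_vote(TOY_CLASSIFIERS, "You are a WINNER!!! See http://example.test"))
    print(majority_vote(TOY_CLASSIFIERS, "Meeting moved to 3pm, agenda attached."))
```

Using an odd number of base classifiers avoids ties in the two-class case; the tie fallback matters only when the ensemble size is even.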
