An Effective Ensemble Approach for Spam Classification

Spam increasingly plagues both individuals and organizations, so distinguishing spam from legitimate email is an important classification problem. This paper presents a neural network ensemble approach based on a specially designed cooperative coevolution paradigm. Each component network corresponds to a separate subpopulation, and all subpopulations are evolved simultaneously. Ensemble performance and the Q-statistic diversity measure are adopted as the objectives, and the component networks are evaluated using multi-objective Pareto optimality. Experimental results show that the proposed algorithm outperforms traditional ensemble methods on spam classification problems.

Keywords: neural network ensemble, cooperative coevolution, spam classification, web service
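The two objectives named in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the function names (`q_statistic`, `dominates`, `pareto_front`), the choice to return 0.0 when the Q-statistic is undefined, and the convention that every objective is maximized are assumptions made here for illustration. The Q-statistic itself is the standard pairwise measure Q = (N11·N00 − N01·N10) / (N11·N00 + N01·N10), where N11 counts samples both classifiers get right, N00 samples both get wrong, and N10, N01 samples only one gets right.

```python
from typing import List, Sequence


def q_statistic(correct_a: Sequence[bool], correct_b: Sequence[bool]) -> float:
    """Pairwise Q-statistic from per-sample correctness flags of two classifiers."""
    if len(correct_a) != len(correct_b):
        raise ValueError("both classifiers must be scored on the same samples")
    n11 = n00 = n10 = n01 = 0
    for a, b in zip(correct_a, correct_b):
        if a and b:
            n11 += 1          # both correct
        elif not a and not b:
            n00 += 1          # both wrong
        elif a:
            n10 += 1          # only the first classifier correct
        else:
            n01 += 1          # only the second classifier correct
    denom = n11 * n00 + n01 * n10
    if denom == 0:
        return 0.0            # assumption: treat the undefined case as "independent"
    return (n11 * n00 - n01 * n10) / denom


def dominates(p: Sequence[float], q: Sequence[float]) -> bool:
    """True if p Pareto-dominates q, assuming every objective is maximized."""
    return all(x >= y for x, y in zip(p, q)) and any(x > y for x, y in zip(p, q))


def pareto_front(points: Sequence[Sequence[float]]) -> List[int]:
    """Indices of the points that no other point dominates."""
    return [
        i for i, p in enumerate(points)
        if not any(dominates(q, p) for j, q in enumerate(points) if j != i)
    ]
```

Because a lower Q indicates a more diverse pair, one way to use these helpers is to score each candidate network as a tuple such as `(ensemble_accuracy, -mean_q)` and keep the candidates returned by `pareto_front`. How the paper actually combines and ranks the two objectives is not stated in the abstract.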
