Spam filtering with abductive networks

Spam messages pose a major threat to the usability of electronic mail. Spam wastes time and money for network users and administrators, consumes network bandwidth and storage space, and slows down email servers. It also provides a medium for distributing harmful code and offensive content. In this paper, we investigate the application of abductive learning to filtering out spam messages. We study the performance of various network models on the spambase dataset. Results reveal that a classification accuracy of 91.7% can be achieved using only 10 of the 57 available content attributes. The abductive learning algorithm selects these attributes automatically as the most effective feature subset, achieving approximately 6:1 data reduction. Comparison with other techniques, such as multi-layer perceptrons and naive Bayesian classifiers, shows that the abductive learning approach can provide better spam detection performance, e.g. false positive rates as low as 5.9%, while requiring much shorter training times.
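The figures quoted above (classification accuracy, false positive rate, and data reduction) follow standard definitions. The sketch below is only an illustration of how such quantities are typically computed from confusion-matrix counts; the function names and the example counts are hypothetical and are not taken from the paper's experiments. Only the attribute counts, 57 and 10, come from the abstract.

```python
def accuracy(tp, tn, fp, fn):
    """Fraction of all messages that were classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def false_positive_rate(fp, tn):
    """Fraction of legitimate messages wrongly flagged as spam."""
    return fp / (fp + tn)


def reduction_ratio(total_attributes, selected_attributes):
    """How many available attributes there are per attribute actually used."""
    return total_attributes / selected_attributes


# Hypothetical counts for illustration only (not results from the paper).
tp, tn, fp, fn = 90, 95, 5, 10
print(accuracy(tp, tn, fp, fn))
print(false_positive_rate(fp, tn))

# Attribute counts from the abstract: 10 selected out of 57 available.
print(reduction_ratio(57, 10))
```

With 10 of 57 attributes retained, the ratio is 5.7, which is the "approximately 6:1" reduction quoted above.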
