Paper Information - Deep Belief Networks for Spam Filtering

Deep Belief Networks for Spam Filtering

This paper proposes a novel approach for spam filtering based on the use of Deep Belief Networks (DBNs). In contrast to conventional feedforward neural networks having one or two hidden layers, DBNs are feedforward neural networks with many hidden layers. Until recently, it was not clear how to initialize the weights of deep neural networks, which resulted in poor solutions with low generalization capabilities. A greedy layer-wise unsupervised algorithm was recently proposed to tackle this problem with successful results. In this work we present a methodology for spam detection based on DBNs and evaluate its performance on three widely used datasets. We also compare our method to Support Vector Machines (SVMs), which are the state-of-the-art method for spam filtering in terms of classification performance. Our experiments indicate that using DBNs to filter spam e-mails is a viable methodology, since they achieve similar or even better performance than SVMs on all three datasets.

Aristidis Likas | Grigorios Tzortzis
