Deep Belief Networks for Spam Filtering

This paper proposes a novel approach for spam filtering based on the use of Deep Belief Networks (DBNs). In contrast to conventional feedfoward neural networks having one or two hidden layers, DBNs are feedforward neural networks with many hidden layers. Until recently it was not clear how to initialize the weights of deep neural networks, which resulted in poor solutions with low generalization capabilities. A greedy layer-wise unsupervised algorithm was recently proposed to tackle this problem with successful results. In this work we present a methodology for spam detection based on DBNs and evaluate its performance on three widely used datasets. We also compare our method to Support Vector Machines (SVMs) which is the state-of-the-art method for spam filtering in terms of classification performance. Our experiments indicate that using DBNs to filter spam e-mails is a viable methodology, since they achieve similar or even better performance than SVMs on all three datasets.