A Novel Approach towards Image Spam Classification

The volume of unsolicited commercial mail has grown sharply in the past few years as the number of internet users has increased. These unsolicited mails, termed spam, occupy large amounts of storage space and bandwidth. Designing an efficient spam filter is therefore a challenging issue for the future generation. Here we use the gradient histogram as a key feature that can be exploited to improve the categorization capability. The gradient values are evaluated for each pixel of an image. The obtained features are then normalized for efficient spam classification and applied as input to a feed-forward back-propagation neural network (BPNN) model. Experiments are conducted with different training/testing rules for the BPNN, and the performance of the proposed method, measured in terms of accuracy, is determined for the various BPNN training rules.
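The feature-extraction step described above (per-pixel gradients, collected into a histogram and then normalized) can be illustrated with a minimal sketch. The abstract does not specify the gradient operator, the number of bins, or the normalization scheme, so the central differences, the 8 orientation bins, the magnitude weighting, and the L1 normalization below are all assumptions, and the function name `gradient_histogram` is hypothetical. The BPNN stage is not shown.

```python
import math


def gradient_histogram(image, bins=8):
    """Return an L1-normalized histogram of gradient orientations.

    `image` is a 2-D grayscale image given as a list of rows.
    Each pixel votes into an orientation bin, weighted by its
    gradient magnitude (an assumption, not taken from the paper).
    """
    height, width = len(image), len(image[0])
    hist = [0.0] * bins
    for y in range(height):
        for x in range(width):
            # Central differences, clamping indices at the image border.
            gx = image[y][min(x + 1, width - 1)] - image[y][max(x - 1, 0)]
            gy = image[min(y + 1, height - 1)][x] - image[max(y - 1, 0)][x]
            magnitude = math.hypot(gx, gy)
            if magnitude == 0:
                continue  # flat region: no orientation to vote for
            # Unsigned orientation folded into [0, pi).
            angle = math.atan2(gy, gx) % math.pi
            index = min(int(angle / math.pi * bins), bins - 1)
            hist[index] += magnitude
    total = sum(hist)
    # Normalize so the features sum to 1; an all-flat image stays all zeros.
    return [h / total for h in hist] if total > 0 else hist


# Example: a vertical edge produces purely horizontal gradients.
edge_image = [[0, 0, 255, 255] for _ in range(4)]
features = gradient_histogram(edge_image, bins=8)
```

The resulting fixed-length vector is the kind of normalized input that could be fed to a feed-forward network; the choice of bin count trades orientation resolution against feature dimensionality.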
