Heterogeneous classifier model for E-mail spam classification using FSO feature selection method

E-mail is one of the most popular modes of communication because it is easily accessible and inexpensive. Its speed and cost effectiveness also lead many senders to use it for commercial advertising, which fills user inboxes with unwanted messages called spam. Spam, also known as junk e-mail, is unsolicited commercial e-mail sent with profit-making content to an indiscriminate group of recipients. It wastes storage space, time, and network bandwidth. An e-mail classifier separates incoming mail into ham and spam based on message content, and an e-mail classification system removes spam from the inbox and moves it to the spam folder. The proposed e-mail classification system has two stages: a training stage and a testing stage. In the initial stage, the input e-mail message is sent to the feature selection module, which picks suitable features for spam classification. In this paper, the firefly and GSO algorithms are combined, as the FSO algorithm, to select appropriate features from the high-dimensional feature space using correlation. Once the best feature space has been determined by the FSO algorithm, e-mail classification is performed by a weighted majority voting system. The classifiers used are naive Bayes, neural networks, and a decision tree. The UCI Spambase dataset is used for e-mail spam classification. The proposed technique is validated using the evaluation metrics precision, recall, and accuracy.
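
The weighted majority voting step and the three evaluation metrics can be illustrated with a minimal Python sketch. It is not the paper's implementation: the abstract does not say how the classifier weights are derived or how ties are resolved, so the weights below (hypothetical validation accuracies), the tie-break in favour of ham, the toy predictions, and all function names are assumptions made purely for illustration.

```python
from collections import defaultdict

SPAM, HAM = 1, 0


def weighted_majority_vote(predictions, weights):
    """Return the label backed by the largest total classifier weight."""
    if len(predictions) != len(weights):
        raise ValueError("one weight is required per classifier prediction")
    totals = defaultdict(float)
    for label, weight in zip(predictions, weights):
        totals[label] += weight
    # Assumption: on a tie the lower label (HAM) wins, so that a
    # legitimate message is not moved to the spam folder by a coin flip.
    return max(sorted(totals), key=lambda label: totals[label])


def precision_recall_accuracy(y_true, y_pred, positive=SPAM):
    """Compute precision, recall and accuracy with spam as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return precision, recall, accuracy


# Toy per-classifier predictions for four e-mails (hypothetical values).
naive_bayes_pred = [SPAM, HAM, SPAM, HAM]
neural_net_pred = [SPAM, HAM, HAM, HAM]
decision_tree_pred = [HAM, SPAM, SPAM, HAM]

# Hypothetical weights, e.g. each classifier's accuracy on held-out data.
weights = [0.80, 0.90, 0.85]

ensemble_pred = [
    weighted_majority_vote(votes, weights)
    for votes in zip(naive_bayes_pred, neural_net_pred, decision_tree_pred)
]

y_true = [SPAM, HAM, SPAM, SPAM]
precision, recall, accuracy = precision_recall_accuracy(y_true, ensemble_pred)
print(ensemble_pred, precision, recall, accuracy)
```

In this toy run the ensemble labels the third e-mail as spam even though the highest-weighted classifier votes ham, because the combined weight of the other two classifiers is larger. In the setting the abstract describes, the per-classifier predictions would come from models trained on the FSO-selected features of the UCI Spambase dataset rather than from hard-coded lists.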
