Paper Information - Spam Mail Detection Using Data Mining: A Comparative Analysis

Spam Mail Detection Using Data Mining: A Comparative Analysis

In the era of digital communication, commercial transactions take place over the Web, and email has become one of the most authoritative and fastest forms of communication. Email's popularity has in turn led to a growing volume of email spam. The resulting increase in superfluous, unsolicited messages not only raises network traffic and storage consumption but also poses a serious security threat to end users. Automatic spam filtering is a promising and worthwhile research area; extensive work has been reported on classifying email spam, but none of the proposed methods guarantees a complete solution. With the rapid growth of digital data, knowledge discovery and data mining have attracted much attention, driven by an urgent need to turn that data into useful information and knowledge. In this paper, the authors examine how email communication is affected by spam and apply various classification-based data mining techniques to a spam data set to separate spam from ham, analyzing the performance of all classifiers to identify the best-performing ones. To this end, the open-source WEKA data mining tool is used to analyze the performance of the different classifiers; the best classifier for email spam classification is identified, and a knowledge flow model is developed.

Suparna DasGupta | Soumyabrata Saha | Suman Kumar Das

[1] Rasim M. Alguliyev, et al. Classification of Textual E-Mail Spam Using Data Mining Techniques, 2011, Appl. Comput. Intell. Soft Comput.

[2] Yue Yang, et al. Anti-Spam Filtering Using Neural Networks and Baysian Classifiers, 2007, 2007 International Symposium on Computational Intelligence in Robotics and Automation.

[3] Deepak Kapgate, et al. A Review on Detection of Web Based Attacks using Data Mining Techniques, 2013.

[4] Stephen E. Fienberg, et al. Bayesian Mixed Membership Models for Soft Clustering and Classification, 2004, GfKl.

[5] Victoria Bellotti, et al. E-mail as habitat: an exploration of embedded personal information management, 2001, INTR.

[6] Donghai Guan, et al. SMS Classification Based on Naïve Bayes Classifier and Apriori Algorithm Frequent Itemset, 2014.

[7] Miguel Rio, et al. Symbiotic Data Mining for Personalized Spam Filtering, 2009, 2009 IEEE/WIC/ACM International Joint Conference on Web Intelligence and Intelligent Agent Technology.