Modeling Suspicious Email Detection using Enhanced Feature Selection

This paper presents a suspicious email detection model that incorporates enhanced feature selection. We propose combining feature selection strategies with classification techniques to detect terrorism-related emails. The model evaluates machine learning algorithms such as decision tree (ID3), logistic regression, Naïve Bayes (NB), and Support Vector Machine (SVM) for detecting emails that contain suspicious content. Various algorithms in the literature achieve good accuracy on this task; however, their results can be improved further with appropriate feature selection mechanisms. We identify a specific feature selection scheme that improves the performance of the existing algorithms.
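The general idea of pairing feature selection with a classifier can be illustrated with a minimal sketch. The abstract does not name the selection scheme the authors settled on, so the example below assumes information gain as the ranking criterion and a Bernoulli Naïve Bayes classifier; the function names and the six toy emails are hypothetical and are not taken from the paper or its data.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(docs, labels, term):
    """Information gain of the binary feature 'term is present in the email'."""
    with_term = [y for d, y in zip(docs, labels) if term in d]
    without = [y for d, y in zip(docs, labels) if term not in d]
    n = len(labels)
    conditional = (len(with_term) / n) * entropy(with_term) \
        + (len(without) / n) * entropy(without)
    return entropy(labels) - conditional


def select_features(docs, labels, k):
    """Keep the k terms with the highest information gain (assumed criterion)."""
    vocab = sorted({t for d in docs for t in d})
    ranked = sorted(vocab, key=lambda t: information_gain(docs, labels, t),
                    reverse=True)
    return set(ranked[:k])


def train_nb(docs, labels, features):
    """Bernoulli Naive Bayes restricted to the selected features."""
    model = {}
    for c in sorted(set(labels)):
        class_docs = [set(d) for d, y in zip(docs, labels) if y == c]
        log_prior = math.log(len(class_docs) / len(docs))
        # Laplace-smoothed probability that each selected term is present.
        p_present = {
            t: (sum(t in d for d in class_docs) + 1) / (len(class_docs) + 2)
            for t in features
        }
        model[c] = (log_prior, p_present)
    return model


def predict(model, doc):
    """Return the class with the highest log-posterior for a tokenized email."""
    present = set(doc)

    def score(c):
        log_prior, p_present = model[c]
        return log_prior + sum(
            math.log(p if t in present else 1.0 - p)
            for t, p in p_present.items()
        )

    return max(model, key=score)


# Toy, made-up emails (tokenized); not data from the paper.
docs = [
    "plan the attack at the station tonight".split(),
    "bring the bomb to the meeting".split(),
    "attack the target with the bomb".split(),
    "see you at the meeting tonight".split(),
    "lunch at the station cafe".split(),
    "bring the report to the office".split(),
]
labels = ["suspicious"] * 3 + ["normal"] * 3

features = select_features(docs, labels, k=2)
model = train_nb(docs, labels, features)
print(sorted(features))
print(predict(model, "bomb attack planned".split()))
```

In this sketch, selection discards terms such as "the" or "meeting" that occur in both classes, so the classifier only sees terms that separate the two classes. Whether that helps on real email data depends on the corpus and on the selection scheme, which is what the paper evaluates.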
