RETRACTED ARTICLE: Cognition based spam mail text analysis using combined approach of deep neural network classifier and random forest

Email spam is a form of automated spam in which unsolicited messages, typically sent for commercial purposes, are distributed in bulk to mailing lists, individuals, or newsgroups. To build an effective spam detection system, we combine Random Forest (RF) with a Deep Neural Network (DNN) classifier and evaluate the resulting classification accuracy. The Random Forest algorithm uses a predetermined probability of selecting attributes when constructing its decision trees, and the Gini measure is used to rank the important features. The main objective is to rank the features using the RF algorithm and to train on the data using the DNN classifier. DNN models are trained with the backpropagation algorithm in batch learning mode, which requires the entire training set to be available at once. The detection process was dynamically fitted to new data patterns until it reached the spam coverage. Experimental results show that the DNN achieves a higher classification rate than KNN and Support Vector Machine (SVM), with an accuracy of 88.59% when the five top-ranked features are used.
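The Gini-based feature ranking mentioned above can be illustrated with a minimal sketch. It is not the authors' implementation: it scores each feature by the largest Gini impurity decrease obtainable from a single threshold split, which is a simplification of the impurity-based importance that a Random Forest averages over many trees. The feature names (`num_links`, `num_exclaims`, `msg_length`) and the toy data are hypothetical, and the DNN training step is not shown.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels: 1 - sum(p_k ** 2)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_gini_decrease(values, labels):
    """Largest impurity decrease over all single-threshold splits of one feature."""
    parent = gini(labels)
    n = len(labels)
    best = 0.0
    # Every distinct value except the largest is a candidate threshold.
    for threshold in sorted(set(values))[:-1]:
        left = [y for x, y in zip(values, labels) if x <= threshold]
        right = [y for x, y in zip(values, labels) if x > threshold]
        weighted_child = (len(left) * gini(left) + len(right) * gini(right)) / n
        best = max(best, parent - weighted_child)
    return best


def rank_features(rows, labels, names, top_k=5):
    """Return the top_k (name, score) pairs, highest Gini decrease first."""
    scores = {
        name: best_gini_decrease([row[i] for row in rows], labels)
        for i, name in enumerate(names)
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)[:top_k]


# Hypothetical toy data: one row per message, label 1 = spam, 0 = not spam.
names = ["num_links", "num_exclaims", "msg_length"]
rows = [
    (5, 3, 120),
    (4, 0, 300),
    (0, 1, 250),
    (1, 0, 130),
    (6, 4, 310),
    (0, 0, 110),
]
labels = [1, 1, 0, 0, 1, 0]

ranking = rank_features(rows, labels, names, top_k=5)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

In this toy data `num_links` separates the two classes perfectly, so it ranks first. In the pipeline the abstract describes, the columns for the highest-ranked features (five in the reported experiment) would then be passed to the neural network classifier.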
