Spam filtering in SMS using recurrent neural networks

Short Message Service (SMS) is a mobile communication service that allows easy and inexpensive communication. Unwanted messages sent for advertising or harassment have become the biggest challenge for this service. Various methods have been presented to detect unsolicited short messages, many of which are based on machine learning. Neural networks have been applied to separate unwanted text messages (known as spam) from normal short messages (known as ham) in SMS. To the best of our knowledge, Recurrent Neural Networks (RNNs) have not yet been applied to this problem. In this paper, we propose a new method that uses an RNN to separate ham from spam. An RNN can process variable-length sequences, although we used a fixed sequence length. The proposed method achieved an accuracy of 98.11%, a considerable improvement over Support Vector Machine (SVM), token-based SVM, and Bayesian algorithms, which achieved accuracies of 97.81%, 97.64%, and 80.54%, respectively.
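As an illustration of the general idea only, the following minimal sketch shows how a message might be encoded to a fixed sequence length and passed through a simple recurrent network that outputs a spam probability. The abstract does not specify the tokenizer, the network architecture, or any hyperparameters, so everything here is an assumption: the whitespace tokenizer, the Elman-style recurrence, the names `build_vocab`, `encode`, and `TinyRNN`, the layer sizes, and the sample messages are all hypothetical. The weights are random and untrained, so the output is not a meaningful prediction.

```python
import math
import random

PAD = 0  # index reserved for padding (assumed convention)
UNK = 1  # index reserved for out-of-vocabulary tokens (assumed convention)


def build_vocab(messages):
    """Map each lowercase whitespace token to an integer index, starting at 2."""
    vocab = {}
    for text in messages:
        for token in text.lower().split():
            vocab.setdefault(token, len(vocab) + 2)
    return vocab


def encode(text, vocab, seq_len):
    """Turn a message into exactly seq_len indices: truncate or right-pad."""
    ids = [vocab.get(tok, UNK) for tok in text.lower().split()][:seq_len]
    return ids + [PAD] * (seq_len - len(ids))


class TinyRNN:
    """Elman-style RNN with a sigmoid output. Weights are random, not trained."""

    def __init__(self, vocab_size, embed_dim=4, hidden_dim=5, seed=0):
        rng = random.Random(seed)

        def mat(rows, cols):
            return [[rng.uniform(-0.5, 0.5) for _ in range(cols)]
                    for _ in range(rows)]

        self.hidden_dim = hidden_dim
        self.embed = mat(vocab_size, embed_dim)   # one vector per token index
        self.w_xh = mat(hidden_dim, embed_dim)    # input-to-hidden weights
        self.w_hh = mat(hidden_dim, hidden_dim)   # hidden-to-hidden weights
        self.w_out = [rng.uniform(-0.5, 0.5) for _ in range(hidden_dim)]

    def spam_probability(self, ids):
        """Run the recurrence over the index sequence and squash to (0, 1)."""
        h = [0.0] * self.hidden_dim
        for idx in ids:
            x = self.embed[idx]
            h = [
                math.tanh(
                    sum(w * xi for w, xi in zip(self.w_xh[j], x))
                    + sum(w * hi for w, hi in zip(self.w_hh[j], h))
                )
                for j in range(self.hidden_dim)
            ]
        logit = sum(w * hi for w, hi in zip(self.w_out, h))
        return 1.0 / (1.0 + math.exp(-logit))


# Hypothetical sample messages, used only to build a toy vocabulary.
vocab = build_vocab(["win a free prize now", "see you at lunch"])
model = TinyRNN(vocab_size=len(vocab) + 2)

short_ids = encode("free prize today", vocab, seq_len=5)
long_ids = encode("win a free prize now see you", vocab, seq_len=5)
probability = model.spam_probability(short_ids)
```

In this sketch, a message shorter than the fixed length is right-padded with `PAD` and a longer one is truncated, so every input has the same number of time steps even though the recurrence itself could run over any length. A real classifier would learn the weights from labelled ham and spam messages.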
