A Neural Network Architecture Combining Gated Recurrent Unit (GRU) and Support Vector Machine (SVM) for Intrusion Detection in Network Traffic Data

The gated recurrent unit (GRU) is a recently developed variation of the long short-term memory (LSTM) unit; both are variants of the recurrent neural network (RNN). Empirical evidence has shown both models to be effective in a wide variety of machine learning tasks such as natural language processing, speech recognition, and text classification. Conventionally, like most neural networks, both of these RNN variants employ the Softmax function as their final output layer for prediction and the cross-entropy function for computing their loss. In this paper, we present an amendment to this norm by introducing a linear support vector machine (SVM) as the replacement for Softmax in the final output layer of a GRU model. Furthermore, the cross-entropy function is replaced with a margin-based function. While there have been similar studies, this proposal is primarily intended for binary classification in intrusion detection, using the 2013 network traffic data from the honeypot systems of Kyoto University. Results show that the GRU-SVM model outperforms the conventional GRU-Softmax model. The proposed GRU-SVM model reached a training accuracy of ≈81.54% and a testing accuracy of ≈84.15%, while the GRU-Softmax model reached a training accuracy of ≈63.07% and a testing accuracy of ≈70.75%. In addition, the juxtaposition of these two final output layers indicates that the SVM would outperform Softmax in prediction time, a theoretical implication that was supported by the actual training and testing times in the study.
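The difference between the two output layers can be illustrated with a minimal sketch in plain Python. It is not the implementation used in the paper: the hidden state `h`, the weights, and the function names are hypothetical, and the squared hinge loss is assumed here as one common margin-based function, since the abstract does not say which one was used.

```python
import math

def linear_scores(h, W, b):
    """Affine map from a final hidden state h to one score per output unit."""
    return [sum(w_i * h_i for w_i, h_i in zip(row, h)) + b_k
            for row, b_k in zip(W, b)]

def softmax_cross_entropy(scores, label):
    """Conventional head: softmax over class scores, then cross-entropy."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    probs = [e / total for e in exps]
    return -math.log(probs[label]), probs.index(max(probs))

def squared_hinge(score, y):
    """SVM head (assumed squared hinge): y is -1 or +1; zero loss once y * score >= 1."""
    loss = max(0.0, 1.0 - y * score) ** 2
    pred = 1 if score >= 0 else -1  # prediction needs only the sign of the score
    return loss, pred

# Hypothetical final GRU hidden state and output weights (illustrative values only).
h = [0.5, -1.0, 2.0]

# Softmax head: one score per class, two classes.
W_soft = [[0.1, 0.2, -0.3], [-0.1, 0.4, 0.5]]
b_soft = [0.0, 0.0]
ce_loss, ce_pred = softmax_cross_entropy(linear_scores(h, W_soft, b_soft), label=1)

# SVM head: a single score for the binary decision.
W_svm = [[-0.1, 0.4, 0.5]]
b_svm = [0.0]
svm_score = linear_scores(h, W_svm, b_svm)[0]
svm_loss, svm_pred = squared_hinge(svm_score, y=+1)
```

In this sketch the SVM head predicts from the sign of a single score, whereas the Softmax head exponentiates and normalizes every class score, which is consistent with the prediction-time argument in the abstract.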
