RNN-VED for Reducing False Positive Alerts in Host-based Anomaly Detection Systems

Host-based Intrusion Detection Systems HIDS are often based on anomaly detection. Several studies deal with anomaly detection by analyzing the system-call traces and get good detection rates but also a high rate off alse positives. In this paper, we propose a new anomaly detection approach applied on the system-call traces. The normal behavior learning is done using a Sequence to sequence model based on a Variational Encoder-Decoder (VED) architecture that integrates Recurrent Neural Networks (RNN) cells. We exploit the semantics behind the invoking order of system-calls that are then seen as sentences. A preprocessing phase is added to structure and optimize the model input-data representation. After the learning step, a one-class classification is run to categorize the sequences as normal or abnormal. The architecture may be used for predicting abnormal behaviors. The tests are achieved on the ADFA-LD dataset.

[1]  Christopher Krügel,et al.  Bayesian event classification for intrusion detection , 2003, 19th Annual Computer Security Applications Conference, 2003. Proceedings..

[2]  Barak A. Pearlmutter,et al.  Detecting intrusions using system calls: alternative data models , 1999, Proceedings of the 1999 IEEE Symposium on Security and Privacy (Cat. No.99CB36344).

[3]  Ali H. Mirza,et al.  Computer network intrusion detection using sequential LSTM Neural Networks autoencoders , 2018, 2018 26th Signal Processing and Communications Applications Conference (SIU).

[4]  Stephen Clark,et al.  Latent Variable Dialogue Models and their Diversity , 2017, EACL.

[5]  V. Rao Vemuri,et al.  Use of K-Nearest Neighbor classifier for intrusion detection , 2002, Comput. Secur..

[6]  Thad Hughes,et al.  Recurrent neural networks for voice activity detection , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.

[7]  Yoshua Bengio,et al.  Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation , 2014, EMNLP.

[8]  Joelle Pineau,et al.  A Hierarchical Latent Variable Encoder-Decoder Model for Generating Dialogues , 2016, AAAI.

[9]  Yongqiang Wang,et al.  Simplifying long short-term memory acoustic models for fast training and decoding , 2016, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[10]  Ahmed Elsherif,et al.  Automatic Intrusion Detection System Using Deep Recurrent Neural Network Paradigm , 2018 .

[11]  Min Zhang,et al.  Variational Neural Machine Translation , 2016, EMNLP.

[12]  W. B. Cavnar,et al.  N-gram-based text categorization , 1994 .

[13]  Brian Hutchinson,et al.  Deep Learning for Unsupervised Insider Threat Detection in Structured Cybersecurity Data Streams , 2017, AAAI Workshops.

[14]  D Zipser,et al.  Learning the hidden structure of speech. , 1988, The Journal of the Acoustical Society of America.

[15]  VARUN CHANDOLA,et al.  Anomaly detection: A survey , 2009, CSUR.

[16]  Somesh Jha,et al.  Markov chains, classifiers, and intrusion detection , 2001, Proceedings. 14th IEEE Computer Security Foundations Workshop, 2001..

[17]  Yoshua Bengio,et al.  Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling , 2014, ArXiv.

[18]  Yan Gao,et al.  Predicting the intrusion intentions by observing system call sequences , 2004, Comput. Secur..

[19]  Jack W. Stokes,et al.  Malware classification with LSTM and GRU language models and a character-level CNN , 2017, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[20]  Geoffrey E. Hinton,et al.  Speech recognition with deep recurrent neural networks , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.

[21]  Yoshua Bengio,et al.  Neural Machine Translation by Jointly Learning to Align and Translate , 2014, ICLR.

[22]  Michael Schatz,et al.  Learning Program Behavior Profiles for Intrusion Detection , 1999, Workshop on Intrusion Detection and Network Monitoring.

[23]  Stephanie Forrest,et al.  A sense of self for Unix processes , 1996, Proceedings 1996 IEEE Symposium on Security and Privacy.

[24]  Paul Jacob,et al.  Host Based Intrusion Detection System with Combined CNN/RNN Model , 2018, Nemesis/UrbReas/SoGood/IWAISe/GDM@PKDD/ECML.

[25]  Dorothy E. Denning,et al.  An Intrusion-Detection Model , 1987, IEEE Transactions on Software Engineering.

[26]  Jiankun Hu,et al.  A Semantic Approach to Host-Based Intrusion Detection Systems Using Contiguousand Discontiguous System Call Patterns , 2014, IEEE Transactions on Computers.

[27]  Jian Wang,et al.  Intrusion Prediction With System-Call Sequence-to-Sequence Model , 2018, IEEE Access.

[28]  Jürgen Schmidhuber,et al.  Long Short-Term Memory , 1997, Neural Computation.

[29]  Abdelwahab Hamou-Lhadj,et al.  An anomaly detection system based on variable N-gram features and one-class SVM , 2017, Inf. Softw. Technol..

[30]  L. Baum,et al.  A Maximization Technique Occurring in the Statistical Analysis of Probabilistic Functions of Markov Chains , 1970 .

[31]  Christopher Krügel,et al.  A quantitative study of accuracy in system call-based malware detection , 2012, ISSTA 2012.

[32]  Quoc V. Le,et al.  Sequence to Sequence Learning with Neural Networks , 2014, NIPS.