A hybrid deep learning model for efficient intrusion detection in big data environment

Abstract The volume of network and Internet traffic is expanding daily, with data being created at the petabyte to zettabyte scale at an exceptionally high rate. Such data can be characterized as big data because of their volume, variety, velocity, and veracity. Security threats to networks, the Internet, websites, and organizations are growing alongside this increase in usage, and detecting intrusions in such a big data environment is difficult. Various intrusion-detection systems (IDSs) using artificial intelligence or machine learning have been proposed for different types of network attacks, but most of these systems either cannot recognize unknown attacks or cannot respond to such attacks in real time. Deep learning models, recently applied to large-scale big data analysis, have shown remarkable performance in general but have not been examined for intrusion detection in a big data environment. This paper proposes a hybrid deep learning model that detects network intrusions efficiently by combining a convolutional neural network (CNN) with a weight-dropped long short-term memory (WDLSTM) network. We use the deep CNN to extract meaningful features from IDS big data, and the WDLSTM to retain long-term dependencies among the extracted features while preventing overfitting on the recurrent connections. The proposed hybrid method was compared with traditional approaches on a publicly available dataset and achieved satisfactory performance.
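The CNN-then-WDLSTM pipeline described in the abstract can be illustrated with a minimal forward-pass sketch in plain Python. It is not the authors' implementation: the abstract gives no layer sizes, so the single convolution kernel, the hidden size, the scalar-per-step input, and the sigmoid output head are all assumptions, and every function name (`conv1d_relu`, `drop_weights`, `lstm_forward`, `hybrid_forward`, `init_params`) is hypothetical. Weight dropping is shown as DropConnect applied to the hidden-to-hidden LSTM weights, with one mask sampled per forward pass.

```python
import math
import random


def conv1d_relu(x, kernel, bias=0.0):
    """Valid 1-D convolution (cross-correlation) over a feature vector, then ReLU."""
    k = len(kernel)
    return [max(0.0, bias + sum(kernel[j] * x[i + j] for j in range(k)))
            for i in range(len(x) - k + 1)]


def drop_weights(matrix, p, rng):
    """DropConnect: zero each weight with probability p and rescale the survivors."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must be in [0, 1)")
    keep = 1.0 - p
    return [[w / keep if rng.random() < keep else 0.0 for w in row]
            for row in matrix]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_forward(seq, W, U, b, hidden):
    """Single-layer LSTM over a sequence of scalars; returns the final hidden state.

    W: 4*hidden input weights, U: (4*hidden) x hidden recurrent weights,
    b: 4*hidden biases. Gate order: input, forget, candidate, output.
    """
    h = [0.0] * hidden
    c = [0.0] * hidden
    for x in seq:
        z = [W[r] * x + sum(U[r][j] * h[j] for j in range(hidden)) + b[r]
             for r in range(4 * hidden)]
        i = [sigmoid(v) for v in z[:hidden]]
        f = [sigmoid(v) for v in z[hidden:2 * hidden]]
        g = [math.tanh(v) for v in z[2 * hidden:3 * hidden]]
        o = [sigmoid(v) for v in z[3 * hidden:]]
        c = [f[k] * c[k] + i[k] * g[k] for k in range(hidden)]
        h = [o[k] * math.tanh(c[k]) for k in range(hidden)]
    return h


def hybrid_forward(record, params, drop_p=0.0, rng=None):
    """CNN feature extraction, then a weight-dropped LSTM, then a sigmoid score."""
    feats = conv1d_relu(record, params["kernel"])
    U = params["U"]
    if drop_p > 0.0:
        # Training-time only: one mask for the whole sequence (assumed scheme).
        U = drop_weights(U, drop_p, rng or random.Random())
    h = lstm_forward(feats, params["W"], U, params["b"], params["hidden"])
    logit = sum(w * v for w, v in zip(params["out"], h)) + params["out_b"]
    return sigmoid(logit)  # interpreted here as an intrusion probability


def init_params(hidden, kernel_size, rng):
    """Small random weights for illustration only; no training is performed."""
    def u():
        return rng.uniform(-0.5, 0.5)
    return {
        "hidden": hidden,
        "kernel": [u() for _ in range(kernel_size)],
        "W": [u() for _ in range(4 * hidden)],
        "U": [[u() for _ in range(hidden)] for _ in range(4 * hidden)],
        "b": [0.0] * (4 * hidden),
        "out": [u() for _ in range(hidden)],
        "out_b": 0.0,
    }


if __name__ == "__main__":
    params = init_params(hidden=4, kernel_size=3, rng=random.Random(0))
    record = [0.2, 0.7, 0.1, 0.9, 0.4, 0.3]  # made-up normalized flow features
    print(hybrid_forward(record, params))                       # inference
    print(hybrid_forward(record, params, 0.5, random.Random(1)))  # training-style pass
```

With `drop_p=0.0` the pass is deterministic, which corresponds to inference; a nonzero `drop_p` perturbs only the recurrent matrix `U`, which is where the regularization on recurrent connections would act in this sketch.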
