Building Auto-Encoder Intrusion Detection System based on random forest feature selection

Abstract: Machine learning techniques have been widely used in intrusion detection for many years. However, these techniques still suffer from a lack of labeled datasets, heavy overhead, and low accuracy. To improve classification accuracy and reduce training time, this paper proposes an effective deep learning method, AE-IDS (Auto-Encoder Intrusion Detection System), based on the random forest algorithm. The method constructs the training set through feature selection and feature grouping. After training, the model predicts results with an auto-encoder, which greatly reduces detection time and effectively improves prediction accuracy. The experimental results show that the proposed method is superior to traditional machine learning based intrusion detection methods in terms of ease of training, adaptability, and detection accuracy.
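The pipeline outlined in the abstract (random forest feature selection followed by an auto-encoder) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes scikit-learn is available, uses synthetic data in place of a real traffic dataset, omits the feature grouping step, trains the auto-encoder on normal traffic only, and scores samples by reconstruction error. The abstract specifies none of these details. All function names (`make_toy_traffic`, `select_features`, `train_autoencoder`, `reconstruction_rmse`, `run_demo`) are hypothetical.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import StandardScaler


def make_toy_traffic(n, attack, rng):
    """Synthetic stand-in for flow records: 3 informative + 5 noise features."""
    t = rng.normal(size=(n, 1))
    # In "normal" traffic the three informative features move together.
    informative = t + 0.2 * rng.normal(size=(n, 3))
    if attack:
        # Assumption: attacks shift the features and break that correlation.
        informative = informative + np.array([4.0, -4.0, 4.0])
    noise = rng.normal(size=(n, 5))
    return np.hstack([informative, noise])


def select_features(X, y, k, seed=0):
    """Rank features by random forest importance; return the top-k indices."""
    forest = RandomForestClassifier(n_estimators=100, random_state=seed)
    forest.fit(X, y)
    order = np.argsort(forest.feature_importances_)[::-1]
    return np.sort(order[:k])


def train_autoencoder(X_normal, hidden=2, seed=0):
    """Fit a small auto-encoder (the input is its own target) on normal traffic."""
    scaler = StandardScaler().fit(X_normal)
    Z = scaler.transform(X_normal)
    ae = MLPRegressor(hidden_layer_sizes=(hidden,), activation="tanh",
                      max_iter=2000, random_state=seed)
    ae.fit(Z, Z)
    return scaler, ae


def reconstruction_rmse(scaler, ae, X):
    """Per-sample root mean square reconstruction error."""
    Z = scaler.transform(X)
    return np.sqrt(np.mean((ae.predict(Z) - Z) ** 2, axis=1))


def run_demo(seed=0):
    rng = np.random.default_rng(seed)
    normal_train = make_toy_traffic(300, False, rng)
    attack_train = make_toy_traffic(300, True, rng)
    X = np.vstack([normal_train, attack_train])
    y = np.array([0] * 300 + [1] * 300)

    # Step 1: feature selection with a random forest on labeled data.
    selected = select_features(X, y, k=3, seed=seed)

    # Step 2: auto-encoder trained on the selected features of normal traffic.
    scaler, ae = train_autoencoder(normal_train[:, selected], seed=seed)

    # Step 3: score held-out samples by reconstruction error.
    normal_test = make_toy_traffic(200, False, rng)[:, selected]
    attack_test = make_toy_traffic(200, True, rng)[:, selected]
    normal_err = reconstruction_rmse(scaler, ae, normal_test)
    attack_err = reconstruction_rmse(scaler, ae, attack_test)
    return selected, normal_err, attack_err
```

Calling `run_demo()` returns the selected feature indices and the per-sample reconstruction errors for held-out normal and attack samples. In a sketch like this, a sample would typically be flagged as an intrusion when its error exceeds a threshold chosen from the errors on normal traffic. The threshold rule is likewise an assumption here.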
