Enabling Anomaly-based Intrusion Detection Through Model Generalization

Anomaly-based intrusion detection by means of machine learning techniques has been extensively studied in the literature, mainly because of its promise to detect new attacks. However, despite the promising reported results, it is rarely deployed in real-world environments. The main obstacle to its adoption is the discrepancy between the accuracy rates obtained during classifier development and the rates obtained in production environments. This discrepancy is mainly caused by non-representative training databases and non-generalizable (scenario-specific) classifier models. This paper presents a method for creating intrusion databases that aims to mimic the characteristics of production environments by using well-known tools. Moreover, we present and evaluate a new validation technique that aims to ensure the generalization capacity of the obtained models by cross-validating them with different intrusion databases. The evaluation tests showed the feasibility of the proposed method. The feature selection technique ensured the generalization capacity of the model, improving its accuracy rate by 13% when tested on different intrusion databases. Finally, the proposed anomaly-based approach was compared with Snort, reaching an accuracy rate of 99% against Snort's 27% for detecting DoS attacks.
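
The idea of validating a model against intrusion databases other than the one it was trained on can be illustrated with a minimal sketch. This is not the paper's implementation: the nearest-centroid classifier, the two-feature toy records, and the names `db_a`, `db_b`, and `cross_database_accuracy` are all assumptions made for illustration only.

```python
from statistics import mean

def fit_centroids(samples):
    """Compute one centroid per class from (features, label) pairs."""
    by_label = {}
    for features, label in samples:
        by_label.setdefault(label, []).append(features)
    return {label: [mean(col) for col in zip(*rows)]
            for label, rows in by_label.items()}

def predict(centroids, features):
    """Return the label whose centroid is closest (squared Euclidean distance)."""
    def sq_dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))

def accuracy(centroids, samples):
    """Fraction of samples whose predicted label matches the true label."""
    hits = sum(predict(centroids, f) == y for f, y in samples)
    return hits / len(samples)

def cross_database_accuracy(databases):
    """Train on each database and test on every other database.

    Returns a dict mapping (train_name, test_name) to accuracy.
    """
    results = {}
    for train_name, train in databases.items():
        model = fit_centroids(train)
        for test_name, test in databases.items():
            if test_name != train_name:
                results[(train_name, test_name)] = accuracy(model, test)
    return results

# Hypothetical toy databases: each record is ([feature values], label).
databases = {
    "db_a": [([1.0, 0.1], "normal"), ([1.2, 0.2], "normal"),
             ([9.0, 5.0], "attack"), ([8.5, 5.5], "attack")],
    "db_b": [([0.8, 0.3], "normal"), ([1.1, 0.0], "normal"),
             ([9.5, 4.0], "attack"), ([7.9, 6.1], "attack")],
}

for pair, acc in cross_database_accuracy(databases).items():
    print(pair, acc)
```

A large gap between same-database accuracy and the cross-database figures reported by such a loop would indicate the scenario-specific behavior the abstract describes.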
