Revising the Outputs of a Decision Tree with Expert Knowledge: Application to Intrusion Detection and Alert Correlation

Classifiers are well-known and efficient techniques for predicting the class of items described by a set of features. In many applications, it is important to take into account extra knowledge in addition to what is encoded by the classifier. For example, in spam filtering, which can be seen as a classification problem, a user may reasonably require that the spam filter flag fewer than a given rate or number of messages as spam. In this paper, we propose an approach for combining expert knowledge with the results of a decision tree classifier. More precisely, we propose to revise the outputs of a decision tree in order to take the available expert knowledge into account. Our approach can be applied to any classifier for which a probability distribution over the set of classes (or decisions) can be estimated from the output of the classification step. In this work, we analyze the advantage of adding expert knowledge to decision tree classifiers in the context of intrusion detection and alert correlation. In particular, we study how additional expert knowledge such as "it is expected that 80% of traffic will be normal" can be integrated into classification tasks. Our aim is to revise the classifiers' outputs so that they fit the expert knowledge. Experimental studies on intrusion detection and alert correlation problems show that our approach improves performance on different benchmarks.
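The abstract does not spell out the revision procedure, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes each item already has a probability distribution over classes (for instance, leaf class frequencies of a decision tree) and that the expert supplies target class proportions such as "80% normal". The function name `revise_outputs`, the reweighting scheme (scale each class by the ratio of expert to currently predicted proportion, then renormalize), and the sample numbers are all assumptions made for illustration.

```python
def revise_outputs(probs, target):
    """Reweight per-item class distributions toward expert class proportions.

    probs  : list of per-item distributions, e.g. [[p_normal, p_attack], ...]
    target : expert class proportions, e.g. [0.8, 0.2]
    Hypothetical helper: a simple prior-shift reweighting, not the paper's algorithm.
    """
    n = len(probs)
    k = len(target)
    # Proportion of each class currently predicted, averaged over all items.
    current = [sum(row[c] for row in probs) / n for c in range(k)]
    # Ratio of expert proportion to predicted proportion for each class.
    weights = [target[c] / current[c] if current[c] > 0 else 0.0
               for c in range(k)]
    revised = []
    for row in probs:
        scaled = [row[c] * weights[c] for c in range(k)]
        total = sum(scaled)
        # Renormalize; keep the original row if every scaled value is zero.
        revised.append([s / total for s in scaled] if total > 0 else list(row))
    return revised


# Made-up classifier outputs over (normal, attack) for four connections.
probs = [[0.6, 0.4], [0.5, 0.5], [0.9, 0.1], [0.3, 0.7]]
# Expert knowledge: about 80% of traffic is expected to be normal.
revised = revise_outputs(probs, [0.8, 0.2])

mean_before = sum(row[0] for row in probs) / len(probs)
mean_after = sum(row[0] for row in revised) / len(revised)
```

In this sketch the average probability assigned to the normal class moves toward the expert's 80% figure without matching it exactly, since each item is renormalized independently. Borderline items can change their predicted class as a result, which is the kind of output revision the abstract describes.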
