A Machine Learning Approach to Anomaly Detection

Much intrusion detection research focuses on signature (misuse) detection, where models are built to recognize known attacks. By its nature, however, signature detection cannot detect novel attacks. Anomaly detection instead models normal behavior and flags significant deviations, which may be novel attacks. In this paper we explore two machine learning methods that construct anomaly detection models from past behavior. The first is a rule learning algorithm that characterizes normal behavior in the absence of labeled attack data. The second uses a clustering algorithm to identify outliers.
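To make the clustering idea concrete, here is a minimal sketch of one way outliers could be found by clustering. It is not the paper's algorithm, which the abstract does not specify. It assumes a simple single-pass, fixed-width clustering, and treats points that land in very small clusters as anomalies. The function names and the parameters `width` and `min_size` are hypothetical, chosen only for illustration.

```python
import math

def fixed_width_clusters(points, width):
    """Single pass: a point joins the first cluster whose center is
    within `width`; otherwise it starts a new cluster (assumed scheme)."""
    centers = []   # center of each cluster (the first point assigned to it)
    members = []   # indices of the points in each cluster
    for i, p in enumerate(points):
        for c, center in enumerate(centers):
            if math.dist(p, center) <= width:
                members[c].append(i)
                break
        else:
            # No existing cluster is close enough: start a new one.
            centers.append(p)
            members.append([i])
    return members

def flag_outliers(points, width, min_size):
    """Return sorted indices of points in clusters smaller than `min_size`."""
    outliers = []
    for cluster in fixed_width_clusters(points, width):
        if len(cluster) < min_size:
            outliers.extend(cluster)
    return sorted(outliers)

# Toy data: four nearby "normal" points and one distant point.
normal = [(1.0, 1.0), (1.1, 0.9), (0.9, 1.2), (1.2, 1.1)]
odd = [(8.0, 8.0)]
print(flag_outliers(normal + odd, width=1.0, min_size=2))  # prints [4]
```

In this toy run the four nearby points form one cluster and the distant point forms a singleton cluster, so only index 4 is flagged. Real traffic or system-call features would need normalization and carefully chosen thresholds.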
