Sequential anomaly detection based on temporal-difference learning: Principles, models and case studies

Anomaly detection is an important problem that has been popularly researched within diverse research areas and application domains. One of the open problems in anomaly detection is the modeling and prediction of complex sequential data, which consist of a series of temporally related behavior patterns. In this paper, a novel sequential anomaly detection method based on temporal-difference (TD) learning is proposed, where the anomaly detection problem of multi-stage cyber attacks is considered as an application case. A Markov reward process model is presented for the anomaly detection and alarming process of sequential data and it is verified that when the reward function is properly defined, the anomaly probabilities of sequential behaviors are equivalent to the value functions of the Markov reward process. Therefore, TD learning algorithms in the reinforcement learning literature can be used to efficiently construct anomaly detection models of complex sequential behaviors by estimating the value functions of the Markov reward process. Compared with other machine learning methods for anomaly detection, the proposed approach has the advantage of simplified labeling process using delayed evaluative signals and the prediction accuracy can be improved even if labeled training data are limited. Based on the experimental results on intrusion detection of host computers using system call data, it was shown that the proposed anomaly detection method can achieve higher or at least comparable detection accuracies than other approaches including SVMs, and HMMs.

[1]  Dan Boneh,et al.  Proceedings of the 11th USENIX Security Symposium , 2002 .

[2]  Connie M. Borror,et al.  Robustness of the Markov-chain model for cyber-attack detection , 2004, IEEE Transactions on Reliability.

[3]  P ? ? ? ? ? ? ? % ? ? ? ? , 1991 .

[4]  Somesh Jha,et al.  Markov chains, classifiers, and intrusion detection , 2001, Proceedings. 14th IEEE Computer Security Foundations Workshop, 2001..

[5]  Xin Xu,et al.  Kernel Least-Squares Temporal Difference Learning , 2006 .

[6]  E Flahavin,et al.  19th National Information Systems Security Conference , 1997 .

[7]  D. Vere-Jones Markov Chains , 1972, Nature.

[8]  Dirk Ourston,et al.  Applications of hidden Markov models to detecting multi-stage network attacks , 2003, 36th Annual Hawaii International Conference on System Sciences, 2003. Proceedings of the.

[9]  V. Rao Vemuri,et al.  Using Text Categorization Techniques for Intrusion Detection , 2002, USENIX Security Symposium.

[10]  Richard S. Sutton,et al.  Learning to predict by the methods of temporal differences , 1988, Machine Learning.

[11]  Jaideep Srivastava,et al.  A Comparative Study of Anomaly Detection Schemes in Network Intrusion Detection , 2003, SDM.

[12]  Terran Lane,et al.  A Decision-Theoritic, Semi-Supervised Model for Intrusion Detection , 2006 .

[13]  Nitesh V. Chawla,et al.  Editorial: special issue on learning from imbalanced data sets , 2004, SKDD.

[14]  Christin Schäfer,et al.  Learning Intrusion Detection: Supervised or Unsupervised? , 2005, ICIAP.

[15]  Andrew W. Moore,et al.  Reinforcement Learning: A Survey , 1996, J. Artif. Intell. Res..

[16]  Stephanie Forrest,et al.  Intrusion Detection Using Sequences of System Calls , 1998, J. Comput. Secur..

[17]  Don R. Hush,et al.  A Classification Framework for Anomaly Detection , 2005, J. Mach. Learn. Res..

[18]  James P. Egan,et al.  Signal detection theory and ROC analysis , 1975 .

[19]  Abdolreza Mirzaei,et al.  Intrusion detection using fuzzy association rules , 2009, Appl. Soft Comput..

[20]  VARUN CHANDOLA,et al.  Anomaly detection: A survey , 2009, CSUR.

[21]  Cannady,et al.  Next Generation Intrusion Detection: Autonomous Reinforcement Learning of Network Attacks , 2000 .

[22]  Anupam Joshi,et al.  Fuzzy clustering for intrusion detection , 2003, The 12th IEEE International Conference on Fuzzy Systems, 2003. FUZZ '03..

[23]  Yuxin Ding,et al.  Host-based intrusion detection using dynamic and static behavioral models , 2003, Pattern Recognit..

[24]  John N. Tsitsiklis,et al.  Analysis of Temporal-Diffference Learning with Function Approximation , 1996, NIPS.

[25]  Vasant Honavar,et al.  Learning Classifiers for Misuse Detection Using a Bag of System Calls Representation , 2005, ISI.

[26]  R. Sekar,et al.  Specification-based anomaly detection: a new approach for detecting network intrusions , 2002, CCS '02.

[27]  Xin Xu,et al.  An Adaptive Network Intrusion Detection Method Based on PCA and Support Vector Machines , 2005, ADMA.

[28]  Philip K. Chan,et al.  Learning nonstationary models of normal network traffic for detecting novel attacks , 2002, KDD.

[29]  Salvatore J. Stolfo,et al.  Adaptive Intrusion Detection: A Data Mining Approach , 2000, Artificial Intelligence Review.

[30]  Justin A. Boyan,et al.  Least-Squares Temporal Difference Learning , 1999, ICML.

[31]  Dong Xiang,et al.  Information-theoretic measures for anomaly detection , 2001, Proceedings 2001 IEEE Symposium on Security and Privacy. S&P 2001.

[33]  Vipin Kumar,et al.  Predicting rare classes: can boosting make any weak learner strong? , 2002, KDD.

[34]  Justin A. Boyan,et al.  Technical Update: Least-Squares Temporal Difference Learning , 2002, Machine Learning.