Anomaly detection using weak estimators

Anomaly detection involves identifying observations that deviate from the normal behavior of a system. One way to achieve this is to identify the phenomena that characterize “normal” observations and then, based on the characteristics learned from them, classify new observations as either “normal” or not. Most state-of-the-art approaches, especially those that belong to the family of parameterized statistical schemes, assume that the underlying distributions of the observations are stationary. That is, they assume that the distributions learned during the training (or learning) phase, though unknown, do not vary with time, and that the same distributions remain relevant as new observations are encountered. Although such a “stationarity” assumption is reasonable for many applications, there are anomaly detection problems where stationarity cannot be assumed. For example, in network monitoring, the patterns learned to represent normal behavior may change over time due to factors such as network infrastructure expansion, new services, and growth of the user population. Similarly, in meteorology, identifying anomalous temperature patterns involves taking into account seasonal changes in normal observations. Detecting anomalies or outliers under these circumstances introduces several challenges. In particular, a detector must adapt to changes in non-stationary environments so that anomalous observations can still be identified as “normal” behavior itself changes. In this paper, we propose applying weak estimation theory to anomaly detection in dynamic environments. In particular, we apply this theory to detect anomalous activities in system calls. Our experimental results demonstrate that our proposal is both feasible and effective for the detection of such anomalous activities.
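
The abstract does not spell out the estimator or the scoring rule, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a multinomial estimator in the style of the stochastic learning weak estimator (SLWE): on each observation, every probability is scaled by a factor `lam` and the remaining mass `1 - lam` is given to the observed symbol, so older observations are gradually forgotten. The class name `WeakEstimator`, the value of `lam`, the toy system-call alphabet, and the negative log-probability score are all assumptions made for illustration.

```python
import math


class WeakEstimator:
    """Illustrative SLWE-style multinomial estimator (hypothetical helper)."""

    def __init__(self, symbols, lam=0.9):
        # lam is an assumed learning parameter in (0, 1); values close to 1
        # forget slowly, smaller values adapt faster.
        if not 0.0 < lam < 1.0:
            raise ValueError("lam must lie strictly between 0 and 1")
        self.lam = lam
        # Start from a uniform estimate over the known alphabet.
        self.p = {s: 1.0 / len(symbols) for s in symbols}

    def score(self, symbol):
        # Assumed anomaly score: negative log of the current estimate.
        # Symbols outside the alphabet get the floor probability.
        return -math.log(max(self.p.get(symbol, 0.0), 1e-12))

    def update(self, symbol):
        # Scale every estimate by lam, then give the freed mass (1 - lam)
        # to the observed symbol; the estimates still sum to 1.
        for s in self.p:
            self.p[s] *= self.lam
        self.p[symbol] += 1.0 - self.lam


# Toy trace over a hypothetical three-call alphabet.
estimator = WeakEstimator(["open", "read", "write"], lam=0.8)
for call in ["open", "read", "read", "read", "read"]:
    estimator.update(call)

# After a run of "read" calls, "write" scores as more surprising than "read".
print(estimator.score("read"), estimator.score("write"))
```

Because the update discounts the past geometrically, the estimate tracks a drifting distribution instead of converging to a fixed one, which is the property that makes this kind of estimator a candidate for non-stationary settings. In practice the score would be compared against a threshold chosen on held-out normal data; that step is omitted here.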
