Integrating Intelligent Anomaly Detection Agents into Distributed Monitoring Systems

High-performance computing clusters have become critical computing resources in many sensitive and/or economically important areas. Anomalies in such systems can be caused by activities such as user misbehavior, intrusions, corrupted data, deadlocks, and failure of cluster components. Effective detection of these anomalies has become a high priority because of the need to guarantee security, privacy and reliability. This paper describes the integration of intelligent anomaly agents and traditional monitoring systems for high-performance distributed systems. The intelligent agents presented in this study employ machine learning techniques to develop profiles of normal behavior as seen in sequences of operating system calls (kernel-level monitoring) and function calls (user-level monitoring) generated by an application. The Ganglia monitoring system was used as a test bed for integration case studies. Mechanisms provided by Ganglia make it relatively easy to integrate anomaly detection systems and to visualize the output of the agents. The results provided demonstrate that the integrated intelligent agents can detect the execution of unauthorized applications and network faults that are not obvious in the standard output of traditional monitoring systems. Hidden Markov models working in user space and neural network models working in kernel space are shown to be effective. Simultaneous monitoring in both user space and kernel space is also demonstrated.
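
As a rough illustration of profiling normal behavior from call sequences, the sketch below trains a first-order Markov chain on call-to-call transitions and scores a new trace by its average negative log-probability. This is a deliberately simpler stand-in, not the hidden Markov models or neural networks used in the paper. The call names, the example traces, the metric name `anomaly_score`, and the helper names are all hypothetical. The last helper only builds an argument list for Ganglia's `gmetric` command-line tool, assuming it is installed, and does not run it.

```python
import math
from collections import Counter, defaultdict


def train_profile(traces):
    """Count call-to-call transitions observed in traces of normal runs."""
    transitions = defaultdict(Counter)
    vocabulary = set()
    for trace in traces:
        vocabulary.update(trace)
        for prev, nxt in zip(trace, trace[1:]):
            transitions[prev][nxt] += 1
    return transitions, vocabulary


def anomaly_score(trace, profile):
    """Average negative log-probability per transition (higher = less normal)."""
    transitions, vocabulary = profile
    size = len(vocabulary) + 1  # one extra slot for calls never seen in training
    pairs = list(zip(trace, trace[1:]))
    if not pairs:
        return 0.0
    total = 0.0
    for prev, nxt in pairs:
        row = transitions.get(prev, Counter())
        prob = (row[nxt] + 1) / (sum(row.values()) + size)  # add-one smoothing
        total -= math.log(prob)
    return total / len(pairs)


def gmetric_command(score, name="anomaly_score"):
    """Build (but do not run) a gmetric invocation that would publish the score."""
    return ["gmetric", f"--name={name}", f"--value={score:.4f}",
            "--type=float", "--units=nats"]


# Hypothetical traces of a well-behaved application.
normal = [
    ["open", "read", "read", "close"],
    ["open", "read", "write", "close"],
]
profile = train_profile(normal)

familiar = anomaly_score(["open", "read", "close"], profile)
unfamiliar = anomaly_score(["open", "execve", "socket", "connect"], profile)
print(familiar < unfamiliar)
print(gmetric_command(unfamiliar))
```

A trace made of transitions never seen in training scores higher than one that follows the learned profile, so a threshold on the score could act as a simple alarm. Publishing the score as an ordinary metric is one plausible way for an agent's output to appear alongside standard monitoring data.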
