A Modeling Framework for Integrated Distributed Systems Fault Management

This paper describes a modeling framework for integrated fault management of distributed systems. The model integrates the different layers of a distributed system, such as the application, system, and network layers, in a single, consistent view. This enables generic management applications to perform their tasks across the layer boundaries of the distributed system without knowledge of layer-specific details. The focus is on fault management issues. Dependencies between resources that are critical to the availability of the distributed system are modeled using relationships. Generic fault management applications are thereby enabled to determine the root cause of a distributed system failure automatically. The SAP R/3 application serves as an example to demonstrate the capabilities of the modeling framework.
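
To make the idea of dependency relationships concrete, the following minimal sketch shows how a generic application might narrow a set of failed resources down to likely root causes. It is an illustration under stated assumptions, not the framework described in this paper: the resource names, the DEPENDS_ON table, and the root_causes function are all hypothetical, and the rule used (a failed resource is a root-cause candidate if none of its direct dependencies has failed) is a deliberately simple stand-in.

```python
from typing import Dict, List, Set

# Hypothetical cross-layer dependency model (application, system, network).
# Each resource maps to the resources it directly depends on.
DEPENDS_ON: Dict[str, List[str]] = {
    "r3_application": ["app_server_process", "database"],
    "app_server_process": ["host_a"],
    "database": ["host_b"],
    "host_a": ["network_link"],
    "host_b": ["network_link"],
    "network_link": [],
}


def root_causes(depends_on: Dict[str, List[str]], failed: Set[str]) -> Set[str]:
    """Return the failed resources that have no failed direct dependency.

    Assumption: a failure propagates upward along dependency relationships,
    so a failed resource whose dependencies are all healthy is a candidate
    root cause.
    """
    return {
        resource
        for resource in failed
        if not any(dep in failed for dep in depends_on.get(resource, []))
    }


if __name__ == "__main__":
    observed_failures = {"r3_application", "database", "host_b"}
    print(root_causes(DEPENDS_ON, observed_failures))
```

Because the function only reads the relationship table, the same logic applies to any layer; nothing in it is specific to the example resources.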