Interoperable tools based on OMIS

On-line tools for development and maintenance of parallel and distributed programs are a precondition for efficient software implementation in this field. The number of available tools is limited because they require complex software infrastructures to perform their tasks. A monitoring system must be integrated into the structure built by application processes, libraries and system processes. It is responsible for observation and manipulation of the parallel program and for event detection during the program run. The complexity of the monitoring system stems from its mediator position: it has to be integrated into almost all software parts of the computer system and partly needs access to its hardware. As a consequence, we do not find any two monitoring systems for on-line tools that are compatible in the sense that they can be used concurrently. This implies that no two different tools can be used at the same time. However, it is desirable to run combinations of on-line tools, such as a debugger and a checkpointing system, or a performance analyzer and a load balancing facility performing process migration. Tools that can be used concurrently in this way are called interoperable tools. Three degrees of interoperability can be distinguished. At the first level, tools just co-exist but are not aware of each other. Inconsistencies might occur when one of them manipulates the program without the others taking notice of this. The second level covers tools which co-exist and coordinate their work. They are aware of other tools’ activities by being able to observe these activities. The third level offers fully cooperating tools which communicate by some specific mechanism. The three levels of interoperability require different types of infrastructure to support the tools appropriately. At level one, we need a common monitoring system to which more than one tool can be attached at a time.
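The first two levels described above can be illustrated with a minimal sketch. It assumes a hypothetical in-process `Monitor` class; the names `Monitor`, `attach`, `observe` and `manipulate` are invented for illustration and are not taken from OMIS or any existing monitoring system. Several tools attach to one shared monitor (level one), and a tool may register a callback to be told about manipulations performed by the other tools (level two).

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical callback signature: (acting tool, process id, new state).
Callback = Callable[[str, str, str], None]


class Monitor:
    """Illustrative shared monitoring system; not an OMIS interface."""

    def __init__(self) -> None:
        self._state: Dict[str, str] = {}                 # process id -> state
        self._tools: Dict[str, Optional[Callback]] = {}  # attached tools

    def attach(self, tool: str, on_manipulation: Optional[Callback] = None) -> None:
        # Level one: more than one tool may be attached at a time.
        self._tools[tool] = on_manipulation

    def observe(self, process: str) -> str:
        # Observation does not change the program state.
        return self._state.get(process, "unknown")

    def manipulate(self, tool: str, process: str, new_state: str) -> None:
        if tool not in self._tools:
            raise KeyError(f"tool {tool!r} is not attached")
        self._state[process] = new_state
        # Level two: inform every other attached tool about the manipulation.
        for other, callback in self._tools.items():
            if other != tool and callback is not None:
                callback(tool, process, new_state)


# Usage: a debugger stops a process and a checkpointing tool takes notice.
monitor = Monitor()
seen: List[Tuple[str, str, str]] = []
monitor.attach("debugger")
monitor.attach("checkpointer", on_manipulation=lambda t, p, s: seen.append((t, p, s)))
monitor.manipulate("debugger", "p1", "stopped")
```

If the checkpointing tool attached without a callback, the two tools would merely co-exist, and the inconsistency described for level one could arise: the debugger would change the process state without the other tool taking notice.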
Level two requires facilities at the monitoring system’s level by which a tool can be informed about activities of other tools. Manipulation operations are of primary interest, as they influence the state of the program system. Level three requires a direct tool communication mechanism, which consists of a protocol specification and a flexible and