An Open Interface for the On-Line Monitoring of Parallel and Distributed Programs

The on-line monitoring interface specification (OMIS) provides a means of developing more powerful, interoperable, and portable tool environments for parallel and distributed systems. It specifies the interaction between any tool and a monitoring system that is responsible for observing and manipulating the program's execution. This well-defined interface makes it possible to use several tools, possibly from different developers, concurrently on the same program run, and to port tools to various target architectures and software environments. As a starting point, the research group at LRR-TUM is designing an OMIS compliant monitoring system (OCM) for the Parallel Virtual Machine (PVM) running on workstation clusters. Tool developers can use this implementation to attach their own on-line tools to the system.
