A Structured Approach to Instrumentation System Development and Evaluation

Software instrumentation is widely used for performance evaluation, debugging, steering, and visualization of parallel programs. As parallel tool development technologies grow more sophisticated and the tools are applied in more areas, runtime data collection and management are growing in importance. We use the term instrumentation system (IS) to refer to the components that support these activities in state-of-the-art parallel tool environments. An IS consists of Local Instrumentation Servers, an Instrumentation System Manager, and a Transfer Protocol. The overheads and perturbation effects attributed to an IS must be accounted for to ensure a correct and efficient representation of program behavior, especially in on-line and real-time environments. Moreover, an IS is a key facilitator of tool integration within an environment. In this paper, we define the primary components of an IS and their roles in an integrated environment, and we classify ISs according to selected features. We introduce a structured approach to planning, designing, modeling, evaluating, implementing, and validating an IS. The approach provides a means to formally address domain-specific requirements. The modeling and evaluation processes are illustrated with three distinct IS case studies: PICL, Paradyn, and Vista. The resulting feedback on the performance effects of IS parameters and policies can assist developers in making design decisions early in the software development cycle. Additionally, structured software engineering methods can support the mapping of an abstract IS model to an implementation of the IS.
