POEMS: end-to-end performance design of large parallel adaptive computational systems

The POEMS project is creating an environment for end-to-end performance modeling of complex parallel and distributed systems, spanning the domains of application software, runtime and operating system software, and hardware architecture. Toward this end, the POEMS framework supports composition of component models from these different domains into an end-to-end system model. This composition can be specified using a generalized graph model of a parallel system, together with interface specifications that carry information about component behaviors and evaluation methods. The POEMS Specification Language compiler will generate an end-to-end system model automatically from such a specification. The components of the target system may be modeled using different modeling paradigms and at various levels of detail. Therefore, evaluation of a POEMS end-to-end system model may require a variety of evaluation tools including specialized equation solvers, queuing network solvers, and discrete event simulators. A single application representation based on static and dynamic task graphs serves as a common workload representation for all these modeling approaches. Sophisticated parallelizing compiler techniques allow this representation to be generated automatically for a given parallel program. POEMS includes a library of predefined analytical and simulation component models of the different domains and a knowledge base that describes performance properties of widely used algorithms. The paper provides an overview of the POEMS methodology and illustrates several of its key components. The modeling capabilities are demonstrated by predicting the performance of alternative configurations of Sweep3D, a benchmark for evaluating wavefront application technologies and high-performance, parallel architectures.
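The idea of a task graph serving as a common workload representation for different component models can be illustrated with a minimal sketch. This is not the POEMS implementation or its Specification Language. Every name here (`analytical_cpu`, `latency_bandwidth_message`, `predict_makespan`) and every numeric parameter is a hypothetical stand-in chosen for illustration. The sketch assumes a static, acyclic task graph in which each task is costed by a pluggable model for its domain, and the predicted execution time is the longest path through the graph.

```python
from collections import deque
from typing import Callable, Dict, List, Tuple

# A component model maps a workload size to a predicted time in seconds.
ComponentModel = Callable[[float], float]


def analytical_cpu(flops: float) -> float:
    """Hypothetical analytical CPU model: a fixed rate of 1e9 flop/s."""
    return flops / 1e9


def latency_bandwidth_message(nbytes: float) -> float:
    """Hypothetical communication model: fixed latency plus a per-byte cost."""
    return 10e-6 + nbytes * 1e-9


def predict_makespan(
    tasks: Dict[str, Tuple[str, float]],
    edges: List[Tuple[str, str]],
    models: Dict[str, ComponentModel],
) -> float:
    """Longest-path time through an acyclic task graph.

    tasks maps a task name to (model kind, workload size);
    edges lists (predecessor, successor) pairs.
    """
    succs: Dict[str, List[str]] = {t: [] for t in tasks}
    indeg: Dict[str, int] = {t: 0 for t in tasks}
    for src, dst in edges:
        succs[src].append(dst)
        indeg[dst] += 1

    # Cost each task with the component model registered for its kind.
    cost = {t: models[kind](size) for t, (kind, size) in tasks.items()}
    finish = dict(cost)  # earliest finish time if a task had no predecessors

    # Process tasks in topological order (Kahn's algorithm).
    ready = deque(t for t in tasks if indeg[t] == 0)
    visited = 0
    while ready:
        t = ready.popleft()
        visited += 1
        for s in succs[t]:
            finish[s] = max(finish[s], finish[t] + cost[s])
            indeg[s] -= 1
            if indeg[s] == 0:
                ready.append(s)

    if visited != len(tasks):
        raise ValueError("task graph contains a cycle")
    return max(finish.values(), default=0.0)


# Illustrative graph: one compute task feeds a message and a second branch.
example_tasks = {
    "compute_a": ("cpu", 2e9),
    "send": ("msg", 1e6),
    "compute_b": ("cpu", 1e9),
    "compute_c": ("cpu", 5e8),
}
example_edges = [("compute_a", "send"), ("send", "compute_b"), ("compute_a", "compute_c")]
example_models = {"cpu": analytical_cpu, "msg": latency_bandwidth_message}

print(predict_makespan(example_tasks, example_edges, example_models))
```

Because the graph only refers to models by kind, swapping the entry for `"msg"` or `"cpu"` with a more detailed model, for example one backed by a simulator, changes the prediction without touching the workload description. That separation is the point the sketch is meant to convey.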
