Toward Formally-Based Design of Message Passing Programs

This paper presents a systematic approach to the development of message passing programs. Our programming model is SPMD (single program, multiple data), with communication restricted to collective operations such as scan, reduction, and gather. The design process in such an architecture-independent language is based on correctness-preserving transformation rules that are provable in a formal functional framework. We develop a set of design rules for composition and decomposition. For example, a scan followed by a reduction is replaced by a single reduction, and a global reduction is decomposed into two faster operations. The impact of the design rules on target performance is estimated analytically and tested in machine experiments. As a case study, we design two provably correct, efficient programs using the Message Passing Interface (MPI) for the well-known maximum segment sum problem, starting from an intuitive but inefficient algorithm specification.
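The composition rule mentioned above (a scan followed by a reduction becoming a single reduction) can be illustrated with a small sequential sketch. It is not the paper's MPI code: it models the collective operations with Python's `accumulate` and `reduce`, and all function names (`scan_then_reduce`, `fused_op`, `mss_op`, and so on) are hypothetical. The sketch assumes the scan operator is `+` and the reduction operator is `max`, a pair for which `+` distributes over `max`. The tuple-based operators are one standard way to write such fusions and are not necessarily the exact formulation used by the authors.

```python
from functools import reduce
from itertools import accumulate


def scan_then_reduce(xs):
    """Unfused form: prefix sums (scan with +), then a reduction with max."""
    return reduce(max, accumulate(xs, lambda a, b: a + b))


def fused_op(left, right):
    """Associative operator on pairs (best prefix sum, total sum).

    Assumption: this pairing is an illustrative encoding of the fused
    scan-reduction, valid because + distributes over max.
    """
    s1, r1 = left
    s2, r2 = right
    return (max(s1, r1 + s2), r1 + r2)


def fused_reduce(xs):
    """Fused form: a single reduction over pairs, then a projection."""
    return reduce(fused_op, [(x, x) for x in xs])[0]


def mss_spec(xs):
    """Intuitive but inefficient specification: try every non-empty segment."""
    n = len(xs)
    return max(sum(xs[i:j]) for i in range(n) for j in range(i + 1, n + 1))


def mss_op(left, right):
    """Associative operator on (best segment, best prefix, best suffix, total)."""
    m1, p1, c1, t1 = left
    m2, p2, c2, t2 = right
    return (
        max(m1, m2, c1 + p2),  # best segment may straddle the boundary
        max(p1, t1 + p2),      # best prefix of the concatenation
        max(c2, c1 + t2),      # best suffix of the concatenation
        t1 + t2,               # total sum
    )


def mss_reduce(xs):
    """Maximum segment sum as one reduction over 4-tuples."""
    return reduce(mss_op, [(x, x, x, x) for x in xs])[0]


if __name__ == "__main__":
    data = [31, -41, 59, 26, -53, 58, 97, -93, -23, 84]
    print(scan_then_reduce(data), fused_reduce(data))
    print(mss_spec(data), mss_reduce(data))
```

Because `fused_op` and `mss_op` are associative, each could in principle serve as the user-defined operator of a single collective reduction, which is the kind of saving the design rules aim for; whether that pays off on a given machine is what the analytical estimates and experiments in the paper address.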
