Toward Compiler Support for Scalable Parallelism Using Multipartitioning

Strategies for partitioning an application's data play a fundamental role in determining the range of possible parallelizations that can be performed and, ultimately, their potential efficiency. This paper describes extensions to the Rice dHPF compiler for High Performance Fortran that enable it to support data distributions based on multipartitioning. Using these distributions can help close the substantial gap in efficiency and scalability between compiler-parallelized codes for multi-directional line-sweep computations and their hand-coded counterparts. We describe the design and implementation of compiler support for multipartitioning and show preliminary results for a benchmark compiled using these techniques.
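
The abstract does not spell out what a multipartitioned distribution looks like, so the following is a minimal illustrative sketch, not the dHPF implementation. It assumes the classic diagonal 3D multipartitioning: p = q*q processors and a domain cut into q x q x q tiles, with each tile assigned so that every processor owns exactly one tile in every slab along every dimension. That property is what keeps all processors busy at each step of a line sweep in any direction. The function names (`diagonal_owner`, `slab_is_balanced`, `check_all_slabs`) and the particular owner formula are assumptions chosen for illustration.

```python
from itertools import product


def diagonal_owner(i, j, k, q):
    """Rank (0..q*q-1) of the processor owning tile (i, j, k).

    Assumed mapping: shift the (i, j) coordinates by k, modulo q, so that
    tiles owned by one processor lie along a wrapped diagonal.
    """
    return ((i - k) % q) * q + ((j - k) % q)


def slab_is_balanced(axis, index, q):
    """True if each of the q*q processors owns exactly one tile in the slab.

    A slab is the set of q*q tiles whose coordinate along `axis` equals `index`.
    """
    owners = []
    for u, v in product(range(q), repeat=2):
        coords = [u, v]
        coords.insert(axis, index)  # fix the slab coordinate on the chosen axis
        owners.append(diagonal_owner(*coords, q))
    return sorted(owners) == list(range(q * q))


def check_all_slabs(q):
    """Check the balance property for every slab along all three axes."""
    return all(
        slab_is_balanced(axis, index, q)
        for axis in range(3)
        for index in range(q)
    )


if __name__ == "__main__":
    for q in (2, 3, 4):
        print(q, check_all_slabs(q))
```

Under this assumed mapping, a sweep along any axis visits the slabs one after another, and within each slab every processor has one tile of work, so no processor sits idle waiting for the sweep to reach it.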
