Optimal Compilation of HPF Remappings

Applications with varying array access patterns require array mappings to be changed dynamically on distributed-memory parallel machines. HPF (High Performance Fortran) provides such remappings explicitly through realign and redistribute directives and implicitly at procedure calls and returns. However, such features are left out of HPF 2.0 for efficiency reasons. This paper presents a new technique for compiling HPF remappings onto message-passing parallel architectures. First, useless remappings that appear naturally are removed. Second, the generated SPMD code takes advantage of replication to shorten the remapping time. Communication is proved optimal: a minimal number of messages, containing only the required data, is sent over the network. The technique is fully implemented in our HPF compiler and was evaluated on a DEC Alpha farm.
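
To make the notion of a message "containing only the required data" concrete, the following minimal sketch enumerates the messages needed to remap a one-dimensional array from a BLOCK to a CYCLIC distribution. It is a brute-force illustration, not the compile-time technique of the paper: the function names (`block_owner`, `cyclic_owner`, `remapping_messages`) are hypothetical, the array is assumed to be unreplicated, and each element is assumed to have exactly one owner before and after the remapping.

```python
from collections import defaultdict


def block_owner(i, n, p):
    """Owner of element i when n elements are BLOCK-distributed on p processors."""
    block = -(-n // p)  # ceil(n / p), the block size
    return i // block


def cyclic_owner(i, p):
    """Owner of element i under a CYCLIC distribution on p processors."""
    return i % p


def remapping_messages(n, p):
    """Group array indices by (sender, receiver) for a BLOCK -> CYCLIC remapping.

    Elements whose owner does not change are local copies and generate no
    message, so each message carries only data the receiver actually needs.
    (Illustrative assumption: no replication, one owner per element.)
    """
    messages = defaultdict(list)
    for i in range(n):
        src = block_owner(i, n, p)
        dst = cyclic_owner(i, p)
        if src != dst:
            messages[(src, dst)].append(i)
    return dict(messages)


msgs = remapping_messages(n=8, p=2)
for (src, dst), indices in sorted(msgs.items()):
    print(f"processor {src} -> processor {dst}: elements {indices}")
```

With 8 elements on 2 processors, half of the elements already reside on their target processor, so only one message per ordered processor pair is needed, and each message holds two elements.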
