Data Flow Analysis Driven Dynamic Data Partitioning

Distributed memory architectures are an effective approach to parallel computing, but they make program development more complex. Finding a partitioning of program code and data that supports sufficient parallelism without incurring prohibitive communication costs is a challenging and critical step in developing programs for distributed memory systems. Automatic data distribution techniques aim to shift responsibility for determining a suitable data partitioning to the compiler. Static program analysis techniques that expose data interrelationships and derive performance estimates are central to the development of automatic data distribution heuristics. In this paper we present a data partitioning heuristic that uses array data flow analysis information to model data interrelationships and to estimate the cost of resolving those interrelationships via communication. The global view provided by data flow analysis permits consideration of potential communication optimizations before data partitioning decisions are made. Our heuristic uses tiling techniques to determine data partitionings. The resulting data distributions, while still regular, are not limited to the standard BLOCK, CYCLIC and BLOCK-CYCLIC varieties. Preliminary results indicate an overall reduction in communication cost with our technique.
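For readers unfamiliar with the standard regular distributions named above, the following is a minimal sketch of how a one-dimensional array index is commonly mapped to an owning processor under BLOCK, CYCLIC and BLOCK-CYCLIC schemes. It is not the heuristic presented in the paper. The function names and the parameters `n` (array length), `p` (processor count) and `b` (block size) are assumptions chosen for illustration.

```python
from math import ceil


def block_owner(i, n, p):
    """BLOCK: n elements are split into p contiguous chunks of size ceil(n / p)."""
    chunk = ceil(n / p)
    return i // chunk


def cyclic_owner(i, p):
    """CYCLIC: elements are dealt out to processors one at a time, round-robin."""
    return i % p


def block_cyclic_owner(i, b, p):
    """BLOCK-CYCLIC: blocks of b consecutive elements are dealt out round-robin."""
    return (i // b) % p


if __name__ == "__main__":
    n, p = 8, 2  # illustrative sizes only
    print([block_owner(i, n, p) for i in range(n)])         # [0, 0, 0, 0, 1, 1, 1, 1]
    print([cyclic_owner(i, p) for i in range(n)])           # [0, 1, 0, 1, 0, 1, 0, 1]
    print([block_cyclic_owner(i, 2, p) for i in range(n)])  # [0, 0, 1, 1, 0, 0, 1, 1]
```

Each of these mappings is determined by at most one parameter beyond the processor count, which is the sense in which they form a restricted family of regular distributions.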
