Simplifying Control Flow in Compiler-Generated Parallel Code

Complex conditional control flow can be an important source of overhead in compiler-generated parallel code for data-parallel programs. Optimizing compilers for data-parallel languages such as High Performance Fortran (HPF) perform a complex sequence of transformations to optimize the performance of the generated code. It would be complicated and expensive for each transformation step to fully account for the results of all others; thus, one transformation may introduce conditional control flow testing predicates on integer values that a later transformation renders unnecessary. Here we describe a pair of algorithms that compute symbolic constraints on the values of integer variables and use them to remove such unnecessary conditional control flow. These algorithms have been implemented in the Rice dHPF compiler. We show that they are effective in reducing the number of conditionals, the code size, and the overall execution time of code generated by dHPF. Finally, we describe a synergy between control-flow simplification and code generation based on loop splitting that achieves the effects of narrower optimizations such as vector message pipelining and the use of overlap areas.
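To illustrate the core idea, the following is a minimal sketch (not the dHPF implementation) of how symbolic constraints on integer variables can prove a branch predicate always true or always false, letting the compiler delete the test. The `Interval` representation, the `guard_outcome` function, and all variable names are illustrative assumptions; dHPF uses richer symbolic constraints than the constant intervals shown here.

```python
from dataclasses import dataclass

@dataclass
class Interval:
    """Known bounds on an integer variable (both endpoints inclusive)."""
    lo: int
    hi: int

def guard_outcome(env, var, op, k):
    """Decide a guard `var op k` under the bounds recorded in `env`.

    Returns True if the guard is provably always taken, False if it is
    provably never taken, and None if the constraints are inconclusive
    (in which case the conditional must be kept).
    """
    iv = env.get(var)
    if iv is None:
        return None  # no constraint known for this variable
    if op == "<=":
        if iv.hi <= k: return True
        if iv.lo > k:  return False
    elif op == ">=":
        if iv.lo >= k: return True
        if iv.hi < k:  return False
    elif op == "<":
        if iv.hi < k:  return True
        if iv.lo >= k: return False
    elif op == ">":
        if iv.lo > k:  return True
        if iv.hi <= k: return False
    return None

# Example: a loop index i known to range over [1, 8] makes the guard
# `if i <= 10` unconditionally true, so the test can be removed and the
# guarded code hoisted; `if i > 8` is unconditionally false, so the
# guarded code is dead.
env = {"i": Interval(1, 8)}
print(guard_outcome(env, "i", "<=", 10))  # always true  -> drop the test
print(guard_outcome(env, "i", ">", 8))    # always false -> drop the branch
print(guard_outcome(env, "j", "<=", 5))   # unknown      -> keep the guard
```

In a compiler setting the `env` would be derived from loop bounds and the transformations that introduced the guards, which is what makes guards inserted by one phase provably redundant to a later one.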
