Dependence Uniformization: A Loop Parallelization Technique

Data dependence uniformization, a method for overcoming the difficulties in parallelizing a doubly nested loop with irregular dependence constraints is proposed. This approach is based on the concept of vector decomposition. A simple set of basic dependences is developed from which all dependence constraints can be composed. The set of basic dependences is added to every iteration to replace all original dependences so that the dependence constraints become uniform. An efficient synchronization method is presented to obey the uniform dependence constraints in every iteration. >

[1]  P.-C. Yew,et al.  On data synchronization for multiprocessors , 1989, ISCA '89.

[2]  Leslie Lamport,et al.  The parallel execution of DO loops , 1974, CACM.

[3]  Utpal Banerjee,et al.  Dependence analysis for supercomputing , 1988, The Kluwer international series in engineering and computer science.

[4]  Michael Wolfe,et al.  Loops skewing: The wavefront method revisited , 1986, International Journal of Parallel Programming.

[5]  Gregory F. Pfister,et al.  “Hot spot” contention and combining in multistage interconnection networks , 1985, IEEE Transactions on Computers.

[6]  CONSTANTINE D. POLYCHRONOPOULOS,et al.  Guided Self-Scheduling: A Practical Scheduling Scheme for Parallel Supercomputers , 1987, IEEE Transactions on Computers.

[7]  Zhiyu Shen,et al.  An Empirical Study on Array Subscripts and Data Dependencies , 1989, ICPP.

[8]  Zhiyuan Li,et al.  An Efficient Data Dependence Analysis for Parallelizing Compilers , 1990, IEEE Trans. Parallel Distributed Syst..

[9]  Yoichi Muraoka,et al.  On the Number of Operations Simultaneously Executable in Fortran-Like Programs and Their Resulting Speedup , 1972, IEEE Transactions on Computers.

[10]  Robert Henry Kuhn,et al.  Optimization and interconnection complexity for: parallel processors, single-stage networks, and decision trees , 1980 .

[11]  R. A. Brouse Advanced computers , 1961, CACM.

[12]  Ten Hwan Tzen Advanced loop parallelization: dependence uniformization and trapezoid self-scheduling , 1992 .

[13]  Ron Cytron,et al.  Doacross: Beyond Vectorization for Multiprocessors , 1986, ICPP.

[14]  L.M. Ni,et al.  Trapezoid Self-Scheduling: A Practical Scheduling Scheme for Parallel Compilers , 1993, IEEE Trans. Parallel Distributed Syst..

[15]  Peiyi Tang,et al.  Dynamic Processor Self-Scheduling for General Parallel Nested Loops , 1987, IEEE Trans. Computers.

[16]  Lionel M. Ni,et al.  Resource Contention in Shared-Memory Multiprocessors: A Parameterized Performance Degradation Model , 1991, J. Parallel Distributed Comput..

[17]  David A. Padua,et al.  Compiler Algorithms for Synchronization , 1987, IEEE Transactions on Computers.