Exploitation of parallelism to nested loops with dependence cycles

In this paper, we analyze recurrences in terms of the breakability of the dependence links formed among general multi-statements in a nested loop. The major findings are: (1) A sink variable renaming technique, which repositions an undesired anti-dependence and/or output-dependence link, is capable of breaking that link. (2) For recurrences connected only by true dependences, a dynamic dependence concept and the technique derived from it are powerful for exploiting parallelism. (3) By combining global dependence testing, a link-breaking strategy, Tarjan's depth-first search algorithm, and topological sorting, we propose an algorithm for resolving a general multi-statement recurrence in a nested loop. Experiments with benchmarks taken from Vector loops showed that, among the 134 subroutines tested, 3 had their parallelism exploitation improved by the proposed method. That is, the proposed algorithm increased the rate of parallelism exploitation for Vector loops by approximately 2.24%.
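
As a rough illustration of how Tarjan's depth-first search and topological sorting can fit together in finding (3), the following Python sketch groups the statements of a loop body into strongly connected components of a statement-level dependence graph and then orders those components. It is not the paper's algorithm: the graph `example_graph`, the statement names `S1` to `S4`, and the helpers `tarjan_scc`, `schedule`, and `break_link` are all hypothetical, and `break_link` merely deletes an edge as a stand-in for whatever link-breaking transformation (such as sink variable renaming) would justify removing it.

```python
from typing import Dict, List


def tarjan_scc(graph: Dict[str, List[str]]) -> List[List[str]]:
    """Return strongly connected components in the order Tarjan's DFS
    completes them (sink components first)."""
    index_of: Dict[str, int] = {}
    lowlink: Dict[str, int] = {}
    stack: List[str] = []
    on_stack = set()
    components: List[List[str]] = []
    counter = 0

    def visit(v: str) -> None:
        nonlocal counter
        index_of[v] = lowlink[v] = counter
        counter += 1
        stack.append(v)
        on_stack.add(v)
        for w in graph.get(v, []):
            if w not in index_of:
                visit(w)
                lowlink[v] = min(lowlink[v], lowlink[w])
            elif w in on_stack:
                lowlink[v] = min(lowlink[v], index_of[w])
        if lowlink[v] == index_of[v]:
            # v is the root of a component: pop its members off the stack.
            comp = []
            while True:
                w = stack.pop()
                on_stack.discard(w)
                comp.append(w)
                if w == v:
                    break
            components.append(sorted(comp))

    for v in graph:
        if v not in index_of:
            visit(v)
    return components


def schedule(graph: Dict[str, List[str]]) -> List[List[str]]:
    """Topologically order the components: reversing Tarjan's output
    places every dependence source before its sink."""
    return list(reversed(tarjan_scc(graph)))


def break_link(graph: Dict[str, List[str]], src: str, dst: str) -> Dict[str, List[str]]:
    """Hypothetical stand-in for a link-breaking step: drop edge src -> dst."""
    return {v: [w for w in ws if not (v == src and w == dst)]
            for v, ws in graph.items()}


# Assumed dependence graph: an edge S -> T means T depends on S.
example_graph = {"S1": ["S2"], "S2": ["S3"], "S3": ["S2", "S4"], "S4": []}

order = schedule(example_graph)
# S2 and S3 form a cycle (a recurrence), so they stay grouped together.

order_after_break = schedule(break_link(example_graph, "S3", "S2"))
# With the back link removed, every statement is its own component.
```

In this sketch, a component with more than one statement corresponds to a recurrence that must still be handled serially or by a further technique, while single-statement components without a self-dependence are candidates for parallel or vector execution.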
