On the Optimality of Allen and Kennedy's Algorithm for Parallelism Extraction in Nested Loops

We explore the link between dependence abstractions and maximal parallelism extraction in nested loops. Our goal is to find, for each dependence abstraction, the minimal transformations needed for maximal parallelism extraction. The main result of this paper is that Allen and Kennedy's algorithm is optimal when dependences are approximated by dependence levels: even the most sophisticated algorithm cannot detect more parallelism than Allen and Kennedy's algorithm does, as long as the dependence level is the only information available.
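
For orientation, the algorithm discussed above can be sketched as follows. This is an illustrative sketch of its usual recursive presentation (prune dependence edges below the current level, compute strongly connected components, sequentialize the minimal level of each cyclic component, recurse), not code from the paper. The names `sccs` and `allen_kennedy` are hypothetical, and the sketch assumes a perfect loop nest of depth `n`, with each dependence given as `(source, target, level)` and `math.inf` standing for a loop-independent dependence.

```python
from math import inf  # inf encodes a loop-independent dependence (assumed convention)


def sccs(nodes, edges):
    """Strongly connected components in topological order (Tarjan's algorithm)."""
    succ = {v: [] for v in nodes}
    for u, v, _ in edges:
        succ[u].append(v)
    index, low, stack, on_stack, out = {}, {}, [], set(), []

    def visit(v):
        index[v] = low[v] = len(index)
        stack.append(v)
        on_stack.add(v)
        for w in succ[v]:
            if w not in index:
                visit(w)
                low[v] = min(low[v], low[w])
            elif w in on_stack:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:  # v is the root of a component
            comp = []
            while True:
                w = stack.pop()
                on_stack.discard(w)
                comp.append(w)
                if w == v:
                    break
            out.append(comp)

    for v in nodes:
        if v not in index:
            visit(v)
    return out[::-1]  # Tarjan emits components in reverse topological order


def allen_kennedy(n, statements, edges, k=1, marks=None):
    """Mark each of the n loops around every statement as 'par' or 'seq'.

    Assumes a valid dependence graph: no cycle made only of
    loop-independent (level inf) edges.
    """
    if marks is None:
        marks = {s: [] for s in statements}
    inside = set(statements)
    # Keep only dependences not already satisfied by outer sequential loops.
    kept = [(u, v, lvl) for (u, v, lvl) in edges
            if lvl >= k and u in inside and v in inside]
    for comp in sccs(statements, kept):
        members = set(comp)
        internal = [lvl for (u, v, lvl) in kept if u in members and v in members]
        if not internal:
            # Single statement with no self-dependence: all remaining loops are parallel.
            marks[comp[0]] += ["par"] * (n - k + 1)
        else:
            l = min(internal)  # minimal dependence level in the component
            for s in comp:
                marks[s] += ["par"] * (l - k) + ["seq"]
            allen_kennedy(n, comp, edges, l + 1, marks)
    return marks


# Two statements in a depth-2 nest: S1 -> S2 at level 1, S2 -> S1 at level 2.
example = allen_kennedy(2, ["S1", "S2"], [("S1", "S2", 1), ("S2", "S1", 2)])
print(example)  # {'S1': ['seq', 'par'], 'S2': ['seq', 'par']}
```

In the example, the outer loop carries the cycle and is kept sequential. Once it is fixed, the level-2 dependence no longer closes a cycle, so the two statements fall into separate components and each gets its own parallel inner loop. The sketch returns only the loop marks; generating the distributed loop nests in component order is left out.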
