The mechanical parallelization of loop nests containing while loops

Acknowledgments Nobody can write a thesis without help from others, and it is usually impossible to express one's gratitude for this immense amount of help. The least I can do is to devote the first pages of my thesis to all these wonderful people, and thank them all for their precious support and individual help. I want to mention some people explicitly, even knowing that my list must be incomplete. First of all, I want to thank Professor Christian Lengauer, who has been an excellent advisor to me. Thank you for my position, for your liberality concerning working modes, for uncountably many fruitful discussions with you (official and private), for your indefatigability in improving my English, for multiple detailed proofreadings of this thesis, for always having time for my problems, ...; in short, thank you for having been a real "Doktorvater", which, to me, is more than an advisor. In addition, I am grateful to Professor Paul Feautrier: thank you for agreeing to review this thesis and for taking the time to give me detailed comments on my draft. I also want to thank Professor P. Kleinschmidt, Professor F.-J. Brandenburg and Professor W. Hahn for agreeing to be my examiners and for their helpfulness. Furthermore, I would like to thank Professor N. Schwartz, who always helps with the formal aspects on the way to a Ph.D. However, there are also helpful people outside of my dissertation committee. First of all, I want to thank my French colleague and friend Jean-François Collard. Thank you for your cooperation from the very beginning of this thesis, when we did not yet know each other. Because of your open-mindedness, we succeeded in working together instead of being competitors. This led to many fruitful discussions and a deep friendship. Thanks a lot for that. Furthermore, I would like to thank the members of the Lehrstuhl für Programmierung for the good working climate and for some helpful hints, and especially Christoph Herrmann for his excellent proofreading. In addition, there is another member of the group I want to mention specifically: Ulrike Lechner. I would call her "the good soul of our group". Thank you for sharing the office and some work, and for the wonderful climate in our office, not only due to your flowers. Apropos climate: one of the most agreeable teams I have ever …

[1]  H. A. Partsch Some Experiments in Transforming Towards Parallel Executability , 1993 .

[2]  Michael Wolfe,et al.  The Tiny Loop Restructuring Research Tool , 1991, ICPP.

[3]  Patrice Quinton,et al.  The systematic design of systolic arrays , 1987 .

[4]  Richard M. Karp,et al.  The Organization of Computations for Uniform Recurrence Equations , 1967, JACM.

[5]  Jingling Xue Automating Non-Unimodular Loop Transformations for Massive Parallelism , 1994, Parallel Comput..

[6]  Ted G. Lewis,et al.  Parallelizing WHILE Loops , 1990, ICPP.

[7]  Corinne Ancourt,et al.  Scanning polyhedra with DO loops , 1991, PPOPP '91.

[8]  Alain J. Martin The Probe: An Addition to Communication Primitives , 1985, Inf. Process. Lett..

[9]  Christian Lengauer,et al.  Unimodularity and the Prallelization of Loops , 1992, Parallel Process. Lett..

[10]  Paul Feautrier Toward Automatic Distribution , 1994, Parallel Process. Lett..

[11]  William Pugh,et al.  Simplifying Polynominal Constraints Over Integers to Make Dependence Analysis More Precise , 1994, CONPAR.

[12]  J.-F. Collard Space-time transformation of while-loops using speculative execution , 1994, Proceedings of IEEE Scalable High Performance Computing Conference.

[13]  William Pugh,et al.  Eliminating false data dependences using the Omega test , 1992, PLDI '92.

[14]  William Pugh,et al.  Going Beyond Integer Programming with the Omega Test to Eliminate False Data Dependences , 1995, IEEE Trans. Parallel Distributed Syst..

[15]  Yves Robert,et al.  Mapping Uniform Loop Nests Onto Distributed Memory Architectures , 1993, Parallel Comput..

[16]  Collard,et al.  Fuzzy Array Dataaow Analysis , 1995 .

[17]  Jang-Ping Sheu,et al.  Partitioning and Mapping Nested Loops on Multiprocessor Systems , 1991, IEEE Trans. Parallel Distributed Syst..

[18]  P. Feautrier Parametric integer programming , 1988 .

[19]  David K. Smith Theory of Linear and Integer Programming , 1987 .

[20]  William Pugh,et al.  A practical algorithm for exact array dependence analysis , 1992, CACM.

[21]  FeautrierLaboratoire Masi Some Eecient Solutions to the Aane Scheduling Problem Part Ii Multidimensional Time , 1992 .

[22]  Thomas Kailath,et al.  Regular iterative algorithms and their implementation on processor arrays , 1988, Proc. IEEE.

[23]  Christian Lengauer,et al.  Loop Parallelization in the Polytope Model , 1993, CONCUR.

[24]  Jean-Francois Collard Code Generation in Automatic Parallelizers , 1994, Applications in Parallel and Distributed Computing.

[25]  W. Kelly,et al.  Code generation for multiple mappings , 1995, Proceedings Frontiers '95. The Fifth Symposium on the Frontiers of Massively Parallel Computation.

[26]  Arthur J. Bernstein,et al.  Analysis of Programs for Parallel Processing , 1966, IEEE Trans. Electron. Comput..

[27]  P. Feautrier Some Eecient Solutions to the Aane Scheduling Problem Part Ii Multidimensional Time , 1992 .

[28]  Patrice Quinton,et al.  The mapping of linear recurrence equations on regular arrays , 1989, J. VLSI Signal Process..

[29]  Alain Darte Regular partitioning for synthesizing fixed-size systolic arrays , 1991, Integr..

[30]  Frédéric Vivien,et al.  Optimal Fine and Medium Grain Parallelism Detection in Polyhedral Reduced Dependence Graphs , 1996, Proceedings of the 1996 Conference on Parallel Architectures and Compilation Technique.

[31]  Utpal Banerjee,et al.  Loop Transformations for Restructuring Compilers: The Foundations , 1993, Springer US.

[32]  Martin Griebl,et al.  Generation of Synchronous Code for Automatic Parallelization of while Loops , 1995, Euro-Par.

[33]  Yves Robert,et al.  Mapping affine loop nests: new results , 1995, HPCN Europe.

[34]  Edsger W. Dijkstra,et al.  Predicate Calculus and Program Semantics , 1989, Texts and Monographs in Computer Science.

[35]  Leslie Lamport,et al.  The parallel execution of DO loops , 1974, CACM.

[36]  Laurence A. Wolsey,et al.  Integer and Combinatorial Optimization , 1988 .

[37]  Wayne R. Dyksen,et al.  Pipelined iterative methods for shared memory machines , 1989, Parallel Comput..

[38]  Jürgen Teich,et al.  Partitioning of processor arrays: a piecewise regular approach , 1993, Integr..

[39]  Ian S. Graham,et al.  The transputer handbook , 1990 .

[40]  A. Darte Aane-by-statement Scheduling of Uniform and Aane Loop Nests over Parametric Domains , 1995 .

[41]  Paul Feautrier,et al.  A Method for Static Scheduling of Dynamic Control Programs Preliminary Version , 1994 .

[42]  Alain Darte,et al.  Automatic Parallelization Based on Multi-Dimensional Scheduling , 1994 .

[43]  Yves Robert,et al.  Constructive Methods for Scheduling Uniform Loop Nests , 1994, IEEE Trans. Parallel Distributed Syst..

[44]  Volker Weispfenning,et al.  Parametric linear and quadratic optimization by elimina-tion , 1994 .

[45]  C. Mongenet,et al.  Calculus of space-optimal mappings of systolic algorithms on processor arrays , 1990, [1990] Proceedings of the International Conference on Application Specific Array Processors.