A finite state machine based formal model of software pipelined loops with conditions

This paper addresses the problem of parallelizing loops with conditional branches in the context of software pipelining. A new formal approach to this problem is proposed, called the Predicated Software Pipelining (PSP) model. The PSP model represents the execution of a loop with conditional branches as transitions of a finite state machine. Each node of the state machine is composed of the operations of one parallelized loop iteration. The rules for moving operations between nodes in the PSP model are described. The model provides a new theoretical framework for further investigation of the inherent properties of such loops, as well as a basis for novel scheduling techniques.
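The core idea can be illustrated with a minimal sketch: a loop containing one conditional branch, pipelined so that each machine state holds the operations issued in one parallelized iteration, and the branch outcome selects the next state. The state names, operation strings, and transition table below are hypothetical, invented purely for illustration; they are not the paper's actual notation or schedule.

```python
# Hypothetical FSM for a software pipelined loop with one conditional branch.
# Each state carries the operations of one parallelized iteration ("ops") and
# a transition table keyed by the branch outcome ("next").
STATES = {
    "S0": {"ops": ["load a[i]", "test a[i] > 0"],
           "next": {True: "S1", False: "S2"}},
    "S1": {"ops": ["b[i] = f(a[i])", "load a[i+1]", "test a[i+1] > 0"],
           "next": {True: "S1", False: "S2"}},
    "S2": {"ops": ["b[i] = g(a[i])", "load a[i+1]", "test a[i+1] > 0"],
           "next": {True: "S1", False: "S2"}},
}

def run(outcomes, start="S0"):
    """Drive the FSM with a sequence of branch outcomes; return visited states."""
    state, trace = start, []
    for taken in outcomes:
        trace.append(state)
        state = STATES[state]["next"][taken]
    trace.append(state)
    return trace

print(run([True, True, False, True]))  # → ['S0', 'S1', 'S1', 'S2', 'S1']
```

Moving an operation between states (for example, hoisting a load from `S1` into `S0`) corresponds, in this sketch, to the PSP model's rules for operation movement between nodes.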
