Improved spill code generation for software pipelined loops

Software pipelining is a loop scheduling technique that extracts parallelism from loops by overlapping the execution of several consecutive iterations. Because iterations overlap, schedules impose high register requirements during their execution. A schedule is valid if it requires at most the number of registers available in the target architecture. If it does not, its register requirements have to be reduced either by decreasing the iteration overlapping or by spilling registers to memory. In this paper we describe a set of heuristics to increase the quality of register-constrained modulo schedules. The heuristics decide between these two alternatives and define criteria for effectively selecting spilling candidates. The heuristics proposed for reducing the register pressure can be applied to any software pipelining technique. The proposals are evaluated using a register-conscious software pipeliner on a workbench composed of a large set of loops from the Perfect Club benchmark and a set of processor configurations. The proposals in this paper are compared against a previous proposal already described in the literature. For one of these processor configurations and the set of loops that do not fit in the available registers (32), a speed-up of 1.68 and a reduction of the memory traffic by a factor of 0.57 are achieved with an affordable increase in compilation time. For all the loops, this represents a speed-up of 1.38 and a reduction of the memory traffic by a factor of 0.7.
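The validity condition above can be illustrated with a minimal sketch. It is not the heuristics of this paper; it only estimates the register requirement of a modulo schedule as the maximum number of simultaneously live values (often called MaxLive) and compares it with the number of available registers. The function names `max_live` and `fits`, the lifetime representation, and the sample lifetimes are assumptions made for illustration. The sketch also keeps the lifetimes fixed while the initiation interval changes, which a real scheduler would not do, since it would reschedule the loop.

```python
def max_live(lifetimes, ii):
    """Estimate the register requirement of a modulo schedule.

    lifetimes: (start, end) cycle pairs for each value within one
               iteration's schedule, with `end` exclusive (hypothetical input format).
    ii:        initiation interval, i.e. cycles between consecutive iterations.
    """
    live_per_row = [0] * ii
    for start, end in lifetimes:
        # In steady state, copies of this value from overlapped iterations
        # occupy every modulo row that the lifetime crosses.
        for cycle in range(start, end):
            live_per_row[cycle % ii] += 1
    return max(live_per_row)


def fits(lifetimes, ii, num_registers):
    """A schedule is treated as valid here if MaxLive does not exceed the register count."""
    return max_live(lifetimes, ii) <= num_registers


# Illustrative lifetimes only, not data from the paper.
lifetimes = [(0, 5), (1, 4), (2, 9)]
tight = max_live(lifetimes, 2)    # heavy overlap of iterations
relaxed = max_live(lifetimes, 4)  # less overlap, lower register pressure
```

With these sample lifetimes, a smaller initiation interval overlaps more iterations and raises the estimate, which is the pressure that the two alternatives in the abstract (reducing the overlap or spilling) are meant to relieve. MaxLive is only an estimate; an actual register allocation may need more registers.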
