Paper information - Software pipelining: an effective scheduling technique for VLIW machines


This paper shows that software pipelining is an effective and viable scheduling technique for VLIW processors. In software pipelining, iterations of a loop in the source program are continuously initiated at constant intervals, before the preceding iterations complete. The advantage of software pipelining is that optimal performance can be achieved with compact object code. This paper extends previous results of software pipelining in two ways. First, it shows that, by using an improved algorithm, near-optimal performance can be obtained without specialized hardware. Second, it proposes a hierarchical reduction scheme whereby entire control constructs are reduced to an object similar to an operation in a basic block. With this scheme, all innermost loops, including those containing conditional statements, can be software pipelined. The scheme also diminishes the start-up cost of loops with a small number of iterations. Hierarchical reduction complements the software pipelining technique, permitting a consistent performance improvement to be obtained. The proposed techniques have been validated by an implementation of a compiler for Warp, a systolic array consisting of 10 VLIW processors. This compiler has been used for developing a large number of applications in the areas of image, signal, and scientific processing.
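To make the idea of initiating iterations at constant intervals concrete, the following minimal Python sketch builds the issue timetable for a toy loop. It is an illustration only, not the scheduling algorithm from the paper: the function names, the three-operation loop body, and the initiation interval of 1 are all hypothetical, and the sketch assumes the machine has enough functional units to issue every overlapping operation and that no dependence between iterations forces a longer interval.

```python
from collections import defaultdict


def pipelined_schedule(op_offsets, initiation_interval, num_iterations):
    """Return {cycle: [(iteration, op_name), ...]} for an overlapped loop.

    op_offsets maps each operation name to the cycle, relative to the start
    of its own iteration, at which it issues (hypothetical input format).
    """
    timetable = defaultdict(list)
    for iteration in range(num_iterations):
        # Each iteration starts a constant number of cycles after the last,
        # before the preceding iterations have completed.
        start = iteration * initiation_interval
        for op_name, offset in op_offsets.items():
            timetable[start + offset].append((iteration, op_name))
    return dict(sorted(timetable.items()))


def total_cycles(op_offsets, initiation_interval, num_iterations):
    """Cycles needed to finish all iterations under the overlapped schedule."""
    if num_iterations == 0:
        return 0
    iteration_length = max(op_offsets.values()) + 1
    return (num_iterations - 1) * initiation_interval + iteration_length


# Hypothetical loop body: one load, one multiply, one store per iteration.
ops = {"load": 0, "mul": 1, "store": 2}
table = pipelined_schedule(ops, initiation_interval=1, num_iterations=4)
for cycle, issued in table.items():
    print(cycle, issued)
```

In this toy schedule, cycles 0 and 1 fill the pipeline, cycles 2 and 3 form the steady state in which one operation from each of three different iterations issues together, and cycles 4 and 5 drain it. Four iterations finish in 6 cycles instead of the 12 that back-to-back execution of a 3-cycle body would need.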

Monica Lam

[1] Joseph Allen Fisher, et al. The Optimization of Horizontal Microcode within and Beyond Basic Blocks: an Application of Processor Scheduling with Resources, 1979.

[2] Monica Sin-Ling Lam, et al. A Systolic Array Optimizing Compiler, 1989.

[3] Alexander Aiken, et al. Perfect Pipelining: A New Loop Parallelization Technique, 1988, ESOP.

[4] H. T. Kung, et al. The Warp Computer: Architecture, Implementation, and Performance, 1987, IEEE Transactions on Computers.

[5] Jian Wang, et al. GURPR—a method for global software pipelining, 1987, MICRO 20.

[6] Kemal Ebcioglu, et al. A compilation technique for software pipelining of loops with conditional jumps, 1987, MICRO 20.

[7] James E. Smith, et al. A study of scalar compilation techniques for pipelined supercomputers, 1987, ASPLOS.

[8] Robert P. Colwell, et al. A VLIW architecture for a trace scheduling compiler, 1987, ASPLOS.

[9] Bogong Su, et al. URPR—An extension of URCR for software pipelining, 1986, MICRO 19.

[10] Thomas R. Gross, et al. Compilation for a high-performance systolic array, 1986, SIGPLAN '86.

[11] Peter Y.-T. Hsu, et al. Highly concurrent scalar processing, 1986, ISCA '86.