Paper information - Meld scheduling: relaxing scheduling constraints across region boundaries

Meld scheduling: relaxing scheduling constraints across region boundaries

Meld scheduling melds the schedules of neighboring scheduling regions to respect the latencies of operations that are issued in one region but complete after control transfers to the other. In contrast, conventional schedulers ignore latency constraints from other regions, leading to potentially avoidable stalls on interlocked (superscalar) machines or incorrect schedules on non-interlocked (VLIW) machines. Alternatively, schedulers that conservatively require all operations to complete before the branch takes effect produce inefficient schedules. In this paper, we present general data structures for maintaining latency constraint information at region boundaries. We present a meld scheduling algorithm that generates latency constraints at the boundaries of scheduled regions and uses this information when scheduling other regions. We present a range of design options and describe the reasons behind our particular choices. We cover certain pitfalls and discuss how to develop an algorithm that addresses them. We evaluate the performance of meld scheduling across a range of non-interlocked machine models using a set of SPEC 92 and Unix benchmarks. We also investigate the sensitivity of the performance improvements to changes in issue width and instruction latencies.
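
The boundary bookkeeping described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's algorithm or data structure: the names `Op`, `exit_constraints`, `merge_constraints`, and `earliest_issue` are hypothetical, and the model assumes that an operation issued at cycle c with latency l makes its result available at cycle c + l, and that the successor region's cycle 0 is the cycle at which the branch takes effect.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Op:
    """Hypothetical operation record (not taken from the paper)."""
    dest: str      # register written by the operation
    srcs: tuple    # registers read by the operation
    latency: int   # cycles from issue until dest is available


def exit_constraints(schedule, region_length):
    """Return {register: cycles still outstanding when the region exits}.

    schedule is a list of (issue_cycle, Op) pairs; region_length is the
    cycle at which control transfers to the successor region.
    """
    dangling = {}
    for issue_cycle, op in schedule:
        remaining = issue_cycle + op.latency - region_length
        if remaining > 0:  # result arrives only after the control transfer
            dangling[op.dest] = max(dangling.get(op.dest, 0), remaining)
    return dangling


def merge_constraints(*constraint_sets):
    """Combine constraints from several predecessors by taking the
    per-register maximum (the conservative choice assumed here)."""
    merged = {}
    for constraints in constraint_sets:
        for reg, cycles in constraints.items():
            merged[reg] = max(merged.get(reg, 0), cycles)
    return merged


def earliest_issue(op, entry_constraints):
    """Earliest cycle in the successor region at which op can read all of
    its sources, considering only latencies inherited across the boundary."""
    return max((entry_constraints.get(reg, 0) for reg in op.srcs), default=0)


# Region A is 3 cycles long; the load issued at cycle 2 completes at cycle 5.
region_a = [
    (0, Op("r2", ("r0",), 1)),
    (2, Op("r1", ("r2",), 3)),
]
from_a = exit_constraints(region_a, region_length=3)
from_c = {"r1": 1, "r3": 4}  # assumed constraints from another predecessor
entry = merge_constraints(from_a, from_c)
use_r1 = Op("r4", ("r1",), 1)
print(from_a, entry, earliest_issue(use_r1, entry))
```

In this sketch, a scheduler for the successor region would treat `earliest_issue` as a lower bound on the issue cycle of any operation that reads a register with an outstanding latency, rather than either ignoring the inherited latency or forcing every operation to finish before the branch.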
