Speculative hedge: regulating compile-time speculation against profile variations

Path-oriented scheduling methods, such as trace scheduling and hyperblock scheduling, use speculation to extract instruction-level parallelism from control-intensive programs. These methods predict important execution paths in the current scheduling scope using execution profiling or frequency estimation. Aggressive speculation is then applied to the important execution paths, possibly at the cost of degraded performance along other paths. Therefore, the speed of the output code can be sensitive to the compiler's ability to accurately predict the important execution paths. Prior work in this area has utilized the speculative yield function by Fisher, coupled with dependence height, to distribute instruction priority among execution paths in the scheduling scope. While this technique provides more stable performance by paying attention to the needs of all paths, it does not directly address the problem of mismatch between compile-time prediction and run-time behavior. The work presented in this paper extends the speculative yield and dependence height heuristic to explicitly minimize the penalty suffered by other paths when instructions are speculated along a path. Since the execution time of a path is determined by the number of cycles spent between a path's entrance and exit in the scheduling scope, the heuristic attempts to eliminate unnecessary speculation that delays any path's exit. Such control of speculation makes the performance much less sensitive to the actual path taken at run time. The proposed method has a strong emphasis on achieving minimal delay to all exits. Thus the name, speculative hedge, is used. This paper presents the speculative hedge heuristic, and shows how it controls over-speculation in a superblock/hyperblock scheduler. The stability of output code performance in the presence of execution variation is demonstrated with six programs from the SPEC CINT92 benchmark suite.
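To make the profile sensitivity concrete, the following is a minimal sketch of one common way to combine speculative yield with dependence height: each instruction's priority is the sum, over the exits it can reach, of the exit probability times the dependence height to that exit. This illustrates the prior-work baseline only, not the speculative hedge heuristic itself, whose details the abstract does not give. The tiny dependence graph, the unit latencies, and the names `deps`, `exits`, `height_to`, `priority`, and `ranked` are all hypothetical.

```python
# Hypothetical superblock: dependence edges (producer -> consumers), unit latency assumed.
deps = {
    "a": ["br1"],   # a feeds the early exit branch br1
    "b": ["c"],     # b -> c -> br2 is the chain feeding the final exit
    "c": ["br2"],
    "br1": [],
    "br2": [],
}
exits = ["br1", "br2"]


def height_to(node, exit_node):
    """Longest dependence path (counted in operations) from node to exit_node.

    Returns None when exit_node does not depend on node.
    """
    if node == exit_node:
        return 1
    best = None
    for succ in deps[node]:
        h = height_to(succ, exit_node)
        if h is not None and (best is None or h + 1 > best):
            best = h + 1
    return best


def priority(node, exit_prob):
    """Exit-probability-weighted dependence height (assumed baseline formulation)."""
    total = 0.0
    for e in exits:
        h = height_to(node, e)
        if h is not None:
            total += exit_prob[e] * h
    return total


def ranked(exit_prob):
    """Instructions ordered from highest to lowest priority under a given profile."""
    return sorted(deps, key=lambda n: priority(n, exit_prob), reverse=True)


if __name__ == "__main__":
    # Two hypothetical profiles that disagree about which exit is usually taken.
    print(ranked({"br1": 0.9, "br2": 0.1}))
    print(ranked({"br1": 0.1, "br2": 0.9}))
```

Under the first profile the instruction feeding the early exit is ranked first; under the second, the chain feeding the final exit is. A schedule built from one profile can therefore delay the exit that the other profile favors, which is the kind of mismatch the abstract says speculative hedge is designed to tolerate.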
