Disjoint eager execution: an optimal form of speculative execution

Instruction-level parallelism (ILP) speedups of an order of magnitude or greater may be possible using the techniques described herein. Traditional speculative execution runs code down one path of a branch (branch prediction) or down both paths of a branch (eager execution) before the branch condition has been evaluated, so that code executes ahead of time and performance improves. A third, optimal method of speculative execution, Disjoint Eager Execution (DEE), is described. A restricted form of DEE, easier to implement than pure DEE, is developed and evaluated. An implementation of both DEE and minimal control dependencies is described. DEE is shown both theoretically and experimentally to yield more parallelism than either branch prediction or eager execution when the same finite execution resources are assumed. ILP speedups by factors in the tens are demonstrated with constrained resources.
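
The difference between the three strategies can be illustrated with a small model. The sketch below is not taken from the paper's implementation. It assumes that every branch is predicted correctly with the same probability `p`, that branches are independent, and that the hardware can hold a fixed number `n` of speculative branch paths. It also assumes, as a simplification, that the sum of the cumulative probabilities of the chosen paths is a reasonable proxy for expected useful work. The function names `single_path`, `eager`, and `disjoint_eager` are hypothetical.

```python
import heapq

def single_path(p, n):
    """Branch prediction: follow only the predicted direction, n branches deep.
    Returns the cumulative probability of each path on that chain."""
    return [p ** depth for depth in range(1, n + 1)]

def eager(p, n):
    """Eager execution: take both directions of every branch, filling the
    branch tree level by level until n paths are in use."""
    chosen, level = [], [1.0]
    while len(chosen) < n:
        next_level = []
        for cp in level:
            next_level.append(cp * p)        # predicted direction
            next_level.append(cp * (1 - p))  # not-predicted direction
        chosen.extend(next_level)
        level = next_level
    return chosen[:n]

def disjoint_eager(p, n):
    """Greedy sketch of DEE (an assumption of this example): always give the
    next resource to the unassigned path with the highest cumulative
    probability, wherever it sits in the branch tree."""
    # heapq is a min-heap, so store negated probabilities.
    frontier = [-p, -(1 - p)]
    heapq.heapify(frontier)
    chosen = []
    for _ in range(n):
        cp = -heapq.heappop(frontier)
        chosen.append(cp)
        # The two successors of the chosen path become candidates.
        heapq.heappush(frontier, -(cp * p))
        heapq.heappush(frontier, -(cp * (1 - p)))
    return chosen

def expected_useful_paths(probs):
    """Proxy metric (assumption): sum of cumulative path probabilities."""
    return sum(probs)

if __name__ == "__main__":
    p, n = 0.7, 6
    for name, strategy in [("branch prediction", single_path),
                           ("eager execution", eager),
                           ("disjoint eager", disjoint_eager)]:
        print(name, round(expected_useful_paths(strategy(p, n)), 4))
```

Under these assumptions the greedy selection never scores lower than the other two strategies for the same `n`, because it picks the `n` most probable paths in the tree. When `p` approaches 1 it degenerates to single-path branch prediction, and when `p` is 0.5 its score matches that of eager execution.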
