Paper Information - Exploiting Large Ineffectual Instruction Sequences

Exploiting Large Ineffectual Instruction Sequences

A processor executes the full dynamic instruction stream in order to compute the final output of a program, yet we observe equivalent, smaller instruction streams that produce the same correct output. Based on this observation, we attempt to identify large, dynamically contiguous regions of instructions that are ineffectual as a whole: they contain either no writes, writes that are never referenced, or writes that do not modify the value of a location. The architectural implication is that instruction fetch and execution can quickly bypass predicted-ineffectual regions, while another thread of control verifies that the implied branch predictions in the region are correct and that the region is truly ineffectual.
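The three kinds of ineffectual behavior named in the abstract can be illustrated with a small offline trace analysis. The following is only a sketch under stated assumptions, not the paper's mechanism: the `Instr` trace format, the `classify` and `ineffectual_regions` helpers, and the `live_out` set are hypothetical names introduced here. It also assumes that instructions without a write (such as correctly predicted branches) do not count as references, and it classifies instructions one at a time before grouping them into contiguous runs, which is a simplification of treating a region as a whole.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Instr:
    """One dynamic instruction in a hypothetical trace format."""
    srcs: tuple = ()               # locations read
    dest: Optional[str] = None     # location written, or None for no write
    value: Optional[int] = None    # value written, as recorded during execution


def classify(trace, initial, live_out):
    """Label each instruction 'no-write', 'silent', 'unreferenced', or 'effectual'."""
    # Forward pass: a write is silent if the location already holds that value.
    state = dict(initial)
    silent = []
    for ins in trace:
        silent.append(ins.dest is not None
                      and ins.dest in state
                      and state[ins.dest] == ins.value)
        if ins.dest is not None:
            state[ins.dest] = ins.value

    # Backward pass: a write is unreferenced if its location is not read
    # by a later effectual instruction (or the program output) before
    # being overwritten.
    live = set(live_out)
    labels = [None] * len(trace)
    for i in range(len(trace) - 1, -1, -1):
        ins = trace[i]
        if ins.dest is None:
            labels[i] = "no-write"       # assumption: reads here are not references
        elif silent[i]:
            labels[i] = "silent"         # earlier value flows through, liveness unchanged
        elif ins.dest not in live:
            labels[i] = "unreferenced"
        else:
            labels[i] = "effectual"
            live.discard(ins.dest)
            live.update(ins.srcs)
    return labels


def ineffectual_regions(labels, min_len=2):
    """Return half-open (start, end) index ranges of contiguous ineffectual runs."""
    regions, start = [], None
    for i, label in enumerate(labels):
        if label != "effectual":
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_len:
                regions.append((start, i))
            start = None
    if start is not None and len(labels) - start >= min_len:
        regions.append((start, len(labels)))
    return regions


trace = [
    Instr(dest="r1", value=10),                  # read by the last instruction
    Instr(dest="r2", value=3),                   # overwritten before any counted read
    Instr(srcs=("r2",)),                         # branch-like, no write
    Instr(dest="r1", value=10),                  # rewrites the same value
    Instr(dest="r2", value=4),                   # never read again
    Instr(srcs=("r1",), dest="out", value=11),   # produces the program output
]
labels = classify(trace, initial={}, live_out={"out"})
regions = ineffectual_regions(labels)
print(labels)
print(regions)  # [(1, 5)]
```

In this toy trace, instructions 1 through 4 form one contiguous ineffectual region, so only the first and last instructions are needed to produce `out`. A hardware scheme cannot look ahead like this backward pass does, which is why the abstract speaks of predicted-ineffectual regions that a second thread of control must verify.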

Eric Rotenberg
