Looking at history to filter allocations in prediction tables

Dependencies between instructions impose an execution order that must be preserved to guarantee the semantic correctness of programs. Recent work proposes prediction techniques that speculatively execute dependent operations, and reports a significant increase in IPC. We propose a mechanism that reduces the area cost of a typical address predictor, the last-address predictor. Our proposal classifies load instructions at run time and records the classifications in a table that has more entries than the prediction table. It then uses this information to initialize the predictor's confidence information and to filter which load instructions are allocated in the prediction table. Using direct-mapped tables, our proposal captures similar predictability, increases the accuracy of the typical address predictor, and saves around 40% of its area cost.
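
The mechanism summarized above can be illustrated with a small sketch. This is not the design evaluated in the paper: the table sizes, the 2-bit saturating confidence counter, the index function, and the use of a few low-order address bits as a cheap "signature" in the classification table are all assumptions chosen to keep the example short. It only shows the general idea: a larger, cheaper classification table decides whether a load may be allocated in the smaller direct-mapped prediction table, and with what initial confidence.

```python
class FilteredLastAddressPredictor:
    """Toy last-address predictor whose allocations are filtered by a
    classification table. All sizes and policies here are illustrative
    assumptions, not parameters taken from the paper."""

    def __init__(self, pred_entries=4, class_entries=16,
                 conf_max=3, conf_threshold=2, sig_bits=8):
        assert class_entries > pred_entries  # classification table is the larger one
        self.pred = [None] * pred_entries    # each entry: [tag, last_addr, conf]
        self.cls = [None] * class_entries    # each entry: [signature, predictable]
        self.conf_max = conf_max
        self.conf_threshold = conf_threshold
        self.sig_mask = (1 << sig_bits) - 1

    def predict(self, pc):
        """Return the predicted address, or None when no confident entry exists."""
        e = self.pred[(pc >> 2) % len(self.pred)]
        if e is not None and e[0] == pc and e[2] >= self.conf_threshold:
            return e[1]
        return None

    def update(self, pc, addr):
        """Train both tables with the address the load actually accessed."""
        # 1. Classify: the load counts as predictable if the low-order bits
        #    of its address match those of its previous execution (assumed rule).
        sig = addr & self.sig_mask
        ci = (pc >> 2) % len(self.cls)
        old = self.cls[ci]
        predictable = old is not None and old[0] == sig
        self.cls[ci] = [sig, predictable]

        # 2. Update the prediction table (direct mapped, tagged with the PC).
        pi = (pc >> 2) % len(self.pred)
        e = self.pred[pi]
        if e is not None and e[0] == pc:
            # Hit: adjust the saturating confidence counter, keep the last address.
            if e[1] == addr:
                e[2] = min(e[2] + 1, self.conf_max)
            else:
                e[2] = max(e[2] - 1, 0)
            e[1] = addr
        elif predictable:
            # Miss: allocate only loads classified as predictable, and start
            # them with enough confidence to predict on their next execution.
            self.pred[pi] = [pc, addr, self.conf_threshold]
        # Miss on an unpredictable load: no allocation, the resident entry survives.


def run_demo(iterations=10):
    """Interleave a constant-address load with a streaming load that maps to
    the same prediction-table entry; return (correct, wrong) prediction counts."""
    p = FilteredLastAddressPredictor()
    correct = wrong = 0
    for i in range(iterations):
        for pc, addr in ((0x10, 0x1000), (0x20, 0x2000 + 8 * i)):
            guess = p.predict(pc)
            if guess is not None:
                if guess == addr:
                    correct += 1
                else:
                    wrong += 1
            p.update(pc, addr)
    return correct, wrong


if __name__ == "__main__":
    print(run_demo())
```

In this sketch the streaming load at the hypothetical PC `0x20` is never classified as predictable, so it never evicts the entry of the constant-address load at `0x10`, even though both map to the same prediction-table slot. Without the filter, each load would evict the other on every execution and the table would never reach a confident prediction.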
