Paper information - Performance Potential of Effective Address Prediction of Load Instructions

Performance Potential of Effective Address Prediction of Load Instructions

Modern, deeply pipelined, out-of-order, and speculative microprocessors are still plagued by the latency of load instructions. This latency is dominated by the latencies to resolve the source operands of the load, to compute its effective address, and to fetch the load’s data from caches or the main memory. This chapter examines the performance potential of hiding a load’s data fetch latency using effective address prediction. By predicting the effective address of a load early in the pipeline, we can initiate the cache access early, thereby improving performance.
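To make the idea of effective address prediction concrete, here is a minimal sketch of one possible predictor: a per-load stride predictor with a small confidence counter. The abstract does not specify which predictor the chapter evaluates, so the stride scheme, the class name `StrideAddressPredictor`, the confidence threshold, and the helper `run_trace` are all illustrative assumptions rather than the authors' design.

```python
class StrideAddressPredictor:
    """Toy per-PC stride predictor (hypothetical; not taken from the chapter)."""

    CONFIDENCE_THRESHOLD = 2  # assumed: predict only after the stride repeats
    CONFIDENCE_MAX = 3        # assumed: 2-bit saturating counter

    def __init__(self):
        # Maps a load's program counter to (last_address, stride, confidence).
        self.table = {}

    def predict(self, pc):
        """Return a predicted effective address, or None if not confident."""
        entry = self.table.get(pc)
        if entry is None:
            return None
        last_addr, stride, confidence = entry
        if confidence >= self.CONFIDENCE_THRESHOLD:
            return last_addr + stride
        return None

    def update(self, pc, actual_addr):
        """Train the entry with the address the load actually computed."""
        entry = self.table.get(pc)
        if entry is None:
            self.table[pc] = (actual_addr, 0, 0)
            return
        last_addr, stride, confidence = entry
        new_stride = actual_addr - last_addr
        if new_stride == stride:
            # Same stride as last time: become more confident.
            confidence = min(confidence + 1, self.CONFIDENCE_MAX)
        else:
            # Stride changed: start over with the new stride.
            confidence = 0
        self.table[pc] = (actual_addr, new_stride, confidence)


def run_trace(trace):
    """Count loads whose address was predicted correctly before execution."""
    predictor = StrideAddressPredictor()
    correct = 0
    for pc, addr in trace:
        if predictor.predict(pc) == addr:
            correct += 1  # a real pipeline could have started the cache access early
        predictor.update(pc, addr)
    return correct


# One load at PC 0x400 walking an array of 8-byte elements.
trace = [(0x400, 1000 + 8 * i) for i in range(6)]
print(run_trace(trace))
```

In this sketch a prediction is made at fetch time from the load's PC alone, which is what would allow the cache access to begin before the source operands are resolved. The first few accesses go unpredicted while the entry trains, and a misprediction would need to be squashed and replayed in real hardware, a cost this sketch does not model.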
