Dynamic speculative precomputation

A large number of memory accesses in memory-bound applications are irregular, such as pointer dereferences, and can be effectively targeted by thread-based prefetching techniques like Speculative Precomputation. These techniques execute instructions, for example on an available SMT thread context, that have been extracted directly from the program they are trying to accelerate. Proposed techniques typically require manual user intervention to extract and optimize instruction sequences. This paper proposes Dynamic Speculative Precomputation, which performs all necessary instruction analysis, extraction, and optimization through the use of back-end instruction analysis hardware, located off the processor's critical path. For a set of memory limited benchmarks an average speedup of 14% is achieved when constructing simple p-slices, and this gain grows to 33% when making use of aggressive optimizations.

[1]  Jignesh M. Patel,et al.  Data prefetching by dependence graph precomputation , 2001, Proceedings 28th Annual International Symposium on Computer Architecture.

[2]  Anoop Gupta,et al.  Tolerating Latency Through Software-Controlled Prefetching in Shared-Memory Multiprocessors , 1991, J. Parallel Distributed Comput..

[3]  Martin C. Carlisle,et al.  Olden: parallelizing programs with dynamic data structures on distributed-memory machines , 1996 .

[4]  Todd C. Mowry,et al.  Tolerating latency through software-controlled data prefetching , 1994 .

[5]  Yale N. Patt,et al.  Putting the fill unit to work: dynamic optimizations for trace cache microprocessors , 1998, Proceedings. 31st Annual ACM/IEEE International Symposium on Microarchitecture.

[6]  Brad Calder,et al.  Dynamic prediction of critical path instructions , 2001, Proceedings HPCA Seventh International Symposium on High-Performance Computer Architecture.

[7]  Eric Rotenberg,et al.  Slipstream processors: improving both performance and fault tolerance , 2000, SIGP.

[8]  Douglas J. Joseph,et al.  Prefetching Using Markov Predictors , 1997, Conference Proceedings. The 24th Annual International Symposium on Computer Architecture.

[9]  Craig Zilles,et al.  Execution-based prediction using speculative slices , 2001, ISCA 2001.

[10]  Jack L. Lo,et al.  Exploiting Choice: Instruction Fetch and Issue on an Implementable Simultaneous Multithreading Processor , 1996, 23rd Annual International Symposium on Computer Architecture (ISCA'96).

[11]  Yale N. Patt,et al.  An effective programmable prefetch engine for on-chip caches , 1995, MICRO 1995.

[12]  Chi-Keung Luk,et al.  Tolerating memory latency through software-controlled pre-execution in simultaneous multithreading processors , 2001, Proceedings 28th Annual International Symposium on Computer Architecture.

[13]  Dionisios N. Pnevmatikatos,et al.  Slice-processors: an implementation of operation-based prediction , 2001, ICS '01.

[14]  Andreas Moshovos,et al.  Dependence based prefetching for linked data structures , 1998, ASPLOS VIII.

[15]  Gurindar S. Sohi,et al.  Speculative data-driven multithreading , 2001, Proceedings HPCA Seventh International Symposium on High-Performance Computer Architecture.

[16]  Norman P. Jouppi,et al.  Improving direct-mapped cache performance by the addition of a small fully-associative cache and pre , 1990, ISCA 1990.

[17]  Dean M. Tullsen,et al.  Handling long-latency loads in a simultaneous multithreading processor , 2001, Proceedings. 34th ACM/IEEE International Symposium on Microarchitecture. MICRO-34.

[18]  Jignesh M. Patel,et al.  Data prefetching by dependence graph precomputation , 2001, ISCA 2001.

[19]  John Paul Shen,et al.  Instruction path coprocessors , 2000, ISCA '00.

[20]  Dean M. Tullsen,et al.  Simultaneous multithreading: Maximizing on-chip parallelism , 1995, Proceedings 22nd Annual International Symposium on Computer Architecture.

[21]  John Paul Shen,et al.  Speculative precomputation: long-range prefetching of delinquent loads , 2001, Proceedings 28th Annual International Symposium on Computer Architecture.

[22]  Alfred V. Aho,et al.  Compilers: Principles, Techniques, and Tools , 1986, Addison-Wesley series in computer science / World student series edition.