A quantitative assessment of thread-level speculation techniques

Speculative thread-level parallelism has been recently proposed as an alternative source of parallelism that can boost the performance for applications where independent threads are hard to find. Several schemes to exploit thread level parallelism have been proposed and significant performance gains have been reported. However, the sources of the performance gains are poorly understood as well as the impact of some design choices. In this work, the advantages of different thread speculation techniques are analyzed as are the impact of some critical issues including the value predictor, the branch predictor, the thread initialization overhead and the connectivity among thread units.

[1]  Gurindar S. Sohi,et al.  Speculative Multithreaded Processors , 2001, Computer.

[2]  李幼升,et al.  Ph , 1989 .

[3]  Vojin G. Oklobdzija,et al.  Multithreaded Decoupled Architecture , 1995, Int. J. High Speed Comput..

[4]  James E. Smith,et al.  The performance potential of data dependence speculation and collapsing , 1996, Proceedings of the 29th Annual IEEE/ACM International Symposium on Microarchitecture. MICRO 29.

[5]  Jenn-Yuan Tsai,et al.  The superthreaded architecture: thread pipelining with run-time data dependence checking and control speculation , 1996, Proceedings of the 1996 Conference on Parallel Architectures and Compilation Technique.

[6]  Antonio González,et al.  Clustered speculative multithreaded processors , 1999, ICS '99.

[7]  Eric Rotenberg,et al.  Trace cache: a low latency approach to high bandwidth instruction fetching , 1996, Proceedings of the 29th Annual IEEE/ACM International Symposium on Microarchitecture. MICRO 29.

[8]  Quinn Jacobson,et al.  Trace processors , 1997, Proceedings of 30th Annual International Symposium on Microarchitecture.

[9]  Gurindar S. Sohi,et al.  Compiling for the multiscalar architecture , 1998 .

[10]  Kunle Olukotun,et al.  Data speculation support for a chip multiprocessor , 1998, ASPLOS VIII.

[11]  Antonio González,et al.  Value prediction for speculative multithreaded architectures , 1999, MICRO-32. Proceedings of the 32nd Annual ACM/IEEE International Symposium on Microarchitecture.

[12]  Mikko H. Lipasti,et al.  Value locality and load value prediction , 1996, ASPLOS VII.

[13]  F. Gabbay Speculative Execution based on Value Prediction Research Proposal towards the Degree of Doctor of Sciences , 1996 .

[14]  Todd C. Mowry,et al.  The potential for using thread-level data speculation to facilitate automatic parallelization , 1998, Proceedings 1998 Fourth International Symposium on High-Performance Computer Architecture.

[15]  Kevin O'Brien,et al.  Single-program speculative multithreading (SPSM) architecture: compiler-assisted fine-grained multithreading , 1995, PACT.

[16]  Haitham Akkary,et al.  A dynamic multithreading processor , 1998, Proceedings. 31st Annual ACM/IEEE International Symposium on Microarchitecture.

[17]  José González,et al.  Data value speculation in superscalar processors , 1998, Microprocess. Microsystems.

[18]  James E. Smith,et al.  Complexity-Effective Superscalar Processors , 1997, Conference Proceedings. The 24th Annual International Symposium on Computer Architecture.

[19]  James E. Smith,et al.  Implementations of Context Based Value Predictors , 1997 .

[20]  Dean M. Tullsen,et al.  Simultaneous multithreading: Maximizing on-chip parallelism , 1995, Proceedings 22nd Annual International Symposium on Computer Architecture.

[21]  Doug Matzke,et al.  Will Physical Scalability Sabotage Performance Gains? , 1997, Computer.

[22]  Monica S. Lam,et al.  In search of speculative thread-level parallelism , 1999, 1999 International Conference on Parallel Architectures and Compilation Techniques (Cat. No.PR00425).

[23]  S. Vajapeyam,et al.  Improving Superscalar Instruction Dispatch And Issue By Exploiting Dynamic Code Sequences , 1997, Conference Proceedings. The 24th Annual International Symposium on Computer Architecture.

[24]  Manoj Franklin,et al.  PEWs: a decentralized dynamic scheduler for ILP processing , 1996, Proceedings of the 1996 ICPP Workshop on Challenges for Parallel Processing.

[25]  Gurindar S. Sohi,et al.  Multiscalar processors , 1995, Proceedings 22nd Annual International Symposium on Computer Architecture.

[26]  Gurindar S. Sohi,et al.  The Expandable Split Window Paradigm for Exploiting Fine-grain Parallelism , 1992, [1992] Proceedings the 19th Annual International Symposium on Computer Architecture.

[27]  David J. Lilja,et al.  Coarse-grained speculative execution in shared-memory multiprocessors , 1998, ICS '98.

[28]  Josep Torrellas,et al.  Hardware and software support for speculative execution of sequential binaries on a chip-multiprocessor , 1998, ICS '98.

[29]  M. Bohr Interconnect scaling-the real limiter to high performance ULSI , 1995, Proceedings of International Electron Devices Meeting.

[30]  David W. Wall,et al.  Limits of instruction-level parallelism , 1991, ASPLOS IV.