Toward a more accurate understanding of the limits of the TLS execution paradigm

Thread-Level Speculation (TLS) facilitates the extraction of parallel threads from sequential applications. Most prior work has focused on developing the compiler and architecture for this execution paradigm, and such studies have often concentrated narrowly on a specific design point. Other studies have attempted to assess how well TLS performs when some architectural or compiler constraint is relaxed. Unfortunately, these studies have failed to assess the true performance potential of TLS, because they have been bound to a specific TLS architecture and have ignored at least one important TLS design choice, such as support for out-of-order task spawn or support for intermediate checkpointing.
