Mitosis compiler: an infrastructure for speculative threading based on pre-computation slices

Speculative parallelization can provide significant sources of additional thread-level parallelism, especially for irregular applications that are hard to parallelize by conventional approaches. In this paper, we present the Mitosis compiler, which partitions applications into speculative threads, with special emphasis on applications for which conventional parallelizing approaches fail.The management of inter-thread data dependences is crucial for the performance of the system. The Mitosis framework uses a pure software approach to predict/compute the thread's input values. This software approach is based on the use of pre-computation slices (p-slices), which are built by the Mitosis compiler and added at the beginning of the speculative thread. P-slices must compute thread input values accurately but they do not need to guarantee correctness, since the underlying architecture can detect and recover from misspeculations. This allows the compiler to use aggressive/unsafe optimizations to significantly reduce their overhead. The most important optimizations included in the Mitosis compiler and presented in this paper are branch pruning, memory and register dependence speculation, and early thread squashing.Performance evaluation of Mitosis compiler/architecture shows an average speedup of 2.2.

[1]  Gurindar S. Sohi,et al.  The Expandable Split Window Paradigm for Exploiting Fine-grain Parallelism , 1992, [1992] Proceedings the 19th Annual International Symposium on Computer Architecture.

[2]  Kunle Olukotun,et al.  Software and Hardware for Exploiting Speculative Parallelism with a Multiprocessor , 1997 .

[3]  Haitham Akkary,et al.  A dynamic multithreading processor , 1998, Proceedings. 31st Annual ACM/IEEE International Symposium on Microarchitecture.

[4]  Craig Zilles,et al.  Execution-based prediction using speculative slices , 2001, ISCA 2001.

[5]  Gurindar S. Sohi,et al.  Master/Slave Speculative Parallelization , 2002, 35th Annual IEEE/ACM International Symposium on Microarchitecture, 2002. (MICRO-35). Proceedings..

[6]  Josep Torrellas,et al.  Eliminating squashes through learning cross-thread violations in speculative parallelization for multiprocessors , 2002, Proceedings Eighth International Symposium on High Performance Computer Architecture.

[7]  Gurindar S. Sohi,et al.  The expandable split window paradigm for exploiting fine-grain parallelsim , 1992, ISCA '92.

[8]  D. Marr,et al.  Hyper-Threading Technology Architecture and MIcroarchitecture , 2002 .

[9]  Josep Torrellas,et al.  Architectural support for scalable speculative parallelization in shared-memory multiprocessors , 2000, Proceedings of 27th International Symposium on Computer Architecture (IEEE Cat. No.RS00201).

[10]  Antonia Zhai,et al.  Improving value communication for thread-level speculation , 2002, Proceedings Eighth International Symposium on High Performance Computer Architecture.

[11]  Chi-Keung Luk,et al.  Tolerating memory latency through software-controlled pre-execution in simultaneous multithreading processors , 2001, Proceedings 28th Annual International Symposium on Computer Architecture.

[12]  Antonio González,et al.  Thread-spawning schemes for speculative multithreading , 2002, Proceedings Eighth International Symposium on High Performance Computer Architecture.

[13]  Marc Tremblay,et al.  The MAJC Architecture: A Synthesis of Parallelism and Scalability , 2000, IEEE Micro.

[14]  D. Scott Wills,et al.  On dynamic speculative thread partitioning and the MEM-slicing algorithm , 1999, 1999 International Conference on Parallel Architectures and Compilation Techniques (Cat. No.PR00425).

[15]  Gurindar S. Sohi,et al.  Speculative data-driven multithreading , 2001, Proceedings HPCA Seventh International Symposium on High-Performance Computer Architecture.

[16]  Yale N. Patt,et al.  Simultaneous subordinate microthreading (SSMT) , 1999, ISCA.

[17]  Chen Yang,et al.  A cost-driven compilation framework for speculative parallelization of sequential programs , 2004, PLDI '04.

[18]  Dean M. Tullsen,et al.  Simultaneous multithreading: Maximizing on-chip parallelism , 1995, Proceedings 22nd Annual International Symposium on Computer Architecture.

[19]  Gurindar S. Sohi,et al.  Compiling for the multiscalar architecture , 1998 .

[20]  Antonio González,et al.  Value prediction for speculative multithreaded architectures , 1999, MICRO-32. Proceedings of the 32nd Annual ACM/IEEE International Symposium on Microarchitecture.

[21]  John Paul Shen,et al.  Speculative precomputation: long-range prefetching of delinquent loads , 2001, Proceedings 28th Annual International Symposium on Computer Architecture.

[22]  Todd C. Mowry,et al.  The potential for using thread-level data speculation to facilitate automatic parallelization , 1998, Proceedings 1998 Fourth International Symposium on High-Performance Computer Architecture.

[23]  Kevin O'Brien,et al.  Single-program speculative multithreading (SPSM) architecture: compiler-assisted fine-grained multithreading , 1995, PACT.

[24]  Jenn-Yuan Tsai,et al.  The superthreaded architecture: thread pipelining with run-time data dependence checking and control speculation , 1996, Proceedings of the 1996 Conference on Parallel Architectures and Compilation Technique.

[25]  Antonio González,et al.  Clustered speculative multithreaded processors , 1999, ICS '99.