A Cost Estimation Model for Speculative Thread Partitioning

Speculative multithreading (SpMT) is an effective mechanism for parallelizing irregular programs that are hard to parallelize with conventional approaches. It allows multiple threads to execute in the presence of ambiguous data and control dependences, while hardware support maintains the correctness of the program. Although speculative parallelization can potentially deliver significant speedup, several overheads associated with the technique can limit the speedup achieved in practice. This paper proposes a novel cost estimation model for speculative thread partitioning that can be used to predict the resulting performance. Based on an analysis of the execution probability flow graph (EPFG) of each procedure, the model attempts to divide the program’s execution time into sequential execution time and parallel execution time. It then estimates the combined runtime effects of the various overheads in order to predict the theoretical speedup of the partitioned speculative procedures. Unlike prior heuristics, which only qualitatively estimate the benefits of speculative multithreaded execution, this model also produces a quantitative estimate of the theoretical speedup. Experimental results show that the prediction accurately reflects the inherent parallelism of the programs' thread partitioning results. The predicted speedup also indicates the potential parallel performance of a given partitioning and can therefore provide better guidance for thread partitioning.
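To make the idea of a quantitative speedup estimate concrete, the following is a minimal Python sketch of an Amdahl-style calculation with speculation overheads. It is not the model proposed in the paper: the function name `estimate_speedup`, its parameters, and the assumption that a misspeculated region is simply re-executed serially are all hypothetical choices made for illustration.

```python
def estimate_speedup(t_seq, t_par, cores, overhead, misspec_prob):
    """Illustrative Amdahl-style speedup estimate with speculation costs.

    All parameters are assumptions for this sketch, not the paper's terms:
      t_seq        -- time spent in code that runs sequentially
      t_par        -- time spent in code that is speculatively parallelized
      cores        -- number of cores available to speculative threads
      overhead     -- lumped cost of spawning, committing, and squashing
      misspec_prob -- probability that speculation fails
    """
    if cores < 1:
        raise ValueError("cores must be at least 1")
    if not 0.0 <= misspec_prob <= 1.0:
        raise ValueError("misspec_prob must lie in [0, 1]")

    # Baseline: the whole program runs on one core.
    baseline = t_seq + t_par

    # Successful speculation spreads the parallel part over all cores;
    # failed speculation is assumed to fall back to serial execution.
    expected_parallel = (1.0 - misspec_prob) * (t_par / cores) + misspec_prob * t_par

    speculative = t_seq + expected_parallel + overhead
    return baseline / speculative


if __name__ == "__main__":
    # No overhead and perfect speculation reduces to Amdahl's law.
    print(estimate_speedup(20.0, 80.0, 4, 0.0, 0.0))
    # Overhead and misspeculation lower the estimate.
    print(estimate_speedup(20.0, 80.0, 4, 10.0, 0.25))
```

With zero overhead and no misspeculation the estimate reduces to Amdahl's law. Adding overhead or raising the misspeculation probability lowers it, which is the kind of effect a cost model for thread partitioning needs to capture.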
