Using Heuristic Value Prediction and Dynamic Task Granularity Resizing to Improve Software Speculation

Exploiting potential thread-level parallelism (TLP) is becoming the key factor in improving the performance of programs on multicore and many-core systems. Among the various parallel execution models, the software-based speculative parallel model has become a research focus because of its low cost, high efficiency, flexibility, and scalability. The performance of a guest program under this model depends closely on the model's speculation accuracy, control overhead, and rollback overhead. In this paper, we first analyze the conventional speculative parallel model and present an analytic model of its expected overall overhead. Guided by this analytic model, we then optimize the conventional model and propose a novel speculative parallel model named HEUSPEC. HEUSPEC combines three key techniques: heuristic value prediction, value-based correctness checking, and dynamic task granularity resizing. We have implemented the runtime system of the model in ANSI C. The experimental results show that the speedup of HEUSPEC reaches 2.20 on average (15% higher than the conventional model) when the speculative depth is 3, and 4.51 on average (12% higher than the conventional model) when the speculative depth is 7. HEUSPEC also shows good scalability and a lower memory cost.
