Optimizing hybrid transactional memory: the importance of nonspeculative operations

Transactional memory (TM) is a speculative shared-memory synchronization mechanism used to speed up concurrent programs. Most current TM implementations are software-based (STM) and incur noticeable overheads for each transactional memory access. Hardware TM proposals (HTM) address this issue but typically suffer from other restrictions such as limits on the number of data locations that can be accessed in a transaction. In this paper, we present several new hybrid TM algorithms that can execute HTM and STM transactions concurrently and can thus provide good performance over a large spectrum of workloads. The algorithms exploit the ability of some HTMs to have both speculative and nonspeculative (nontransactional) memory accesses within a transaction to decrease the transactions' runtime overhead, abort rates, and hardware capacity requirements. We evaluate implementations of these algorithms based on AMD's Advanced Synchronization Facility, an x86 instruction set extension proposal that has been shown to provide a sound basis for HTM.
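To make the capacity argument concrete, the following is a toy, single-threaded sketch and not the algorithms presented in the paper. It assumes a hypothetical hardware transaction that aborts once it tracks more than `capacity` locations, and a software fallback without that limit. All names (`ToyHardwareTx`, `ToySoftwareTx`, `run_hybrid`, `spec_load`, `nonspec_load`) are illustrative assumptions and do not correspond to ASF instructions or any real library. Conflict detection and concurrent execution are deliberately left out; a real hybrid TM would have to validate nonspeculatively read data by other means.

```python
class CapacityAbort(Exception):
    """Raised when the simulated hardware transaction tracks too many locations."""


class ToyHardwareTx:
    """Hypothetical HTM model: speculative accesses consume limited tracking capacity."""

    def __init__(self, memory, capacity):
        self.memory = memory
        self.capacity = capacity
        self.tracked = set()  # locations accessed speculatively
        self.writes = {}      # buffered speculative writes

    def _track(self, addr):
        self.tracked.add(addr)
        if len(self.tracked) > self.capacity:
            raise CapacityAbort(addr)

    def spec_load(self, addr):
        self._track(addr)
        return self.writes.get(addr, self.memory[addr])

    def spec_store(self, addr, value):
        self._track(addr)
        self.writes[addr] = value

    def nonspec_load(self, addr):
        # Nonspeculative access: reads memory directly and uses no capacity.
        return self.memory[addr]

    def commit(self):
        self.memory.update(self.writes)


class ToySoftwareTx:
    """Hypothetical STM fallback: same interface, no capacity limit."""

    def __init__(self, memory):
        self.memory = memory
        self.writes = {}

    def spec_load(self, addr):
        return self.writes.get(addr, self.memory[addr])

    def spec_store(self, addr, value):
        self.writes[addr] = value

    nonspec_load = spec_load  # the software path treats both kinds of load alike

    def commit(self):
        self.memory.update(self.writes)


def run_hybrid(memory, body, capacity):
    """Try the hardware path first; fall back to software on a capacity abort."""
    tx = ToyHardwareTx(memory, capacity)
    try:
        result = body(tx)
        tx.commit()
        return "htm", result
    except CapacityAbort:
        # Buffered hardware writes are simply discarded; memory is unchanged.
        tx = ToySoftwareTx(memory)
        result = body(tx)
        tx.commit()
        return "stm", result


def sum_speculative(tx):
    total = sum(tx.spec_load(i) for i in range(8))
    tx.spec_store("total", total)
    return total


def sum_nonspeculative(tx):
    total = sum(tx.nonspec_load(i) for i in range(8))
    tx.spec_store("total", total)
    return total
```

In this model, `sum_speculative` touches nine locations speculatively, so with `capacity=4` it falls back to the software path, while `sum_nonspeculative` tracks only the single written location and can complete on the hardware path. This mirrors, in a very simplified way, how nonspeculative accesses can reduce hardware capacity requirements.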
