Refined transactional lock elision

Transactional lock elision (TLE) is a well-known technique that exploits hardware transactional memory (HTM) to introduce concurrency into lock-based software. It achieves that by attempting to execute a critical section protected by a lock in an atomic hardware transaction, reverting to the lock if these attempts fail. One significant drawback of TLE is that it disables hardware speculation once there is a thread running under lock. In this paper we present two algorithms that rely on existing compiler support for transactional programs and allow threads to speculate concurrently on HTM along with a thread holding the lock. We demonstrate the benefit of our algorithms over TLE and other related approaches with an in-depth analysis of a number of benchmarks and a wide range of workloads, including an AVL tree-based micro-benchmark and ccTSA, a real sequence assembler application.

[1]  Michael F. Spear,et al.  Hybrid Transactional Memory Revisited , 2015, DISC.

[2]  James R. Goodman,et al.  Speculative lock elision: enabling highly concurrent multithreaded execution , 2001, MICRO.

[3]  Mark Moir,et al.  Hybrid transactional memory , 2006, ASPLOS XII.

[4]  Torvald Riegel,et al.  Optimizing hybrid transactional memory: the importance of nonspeculative operations , 2011, SPAA '11.

[5]  Michael Isard,et al.  Scalability! But at what COST? , 2015, HotOS.

[6]  Nir Shavit,et al.  Amalgamated Lock-Elision , 2015, DISC.

[7]  Michael F. Spear,et al.  NOrec: streamlining STM by abolishing ownership records , 2010, PPoPP '10.

[8]  Yehuda Afek,et al.  Software-improved hardware lock elision , 2014, PODC '14.

[9]  Sean White,et al.  Hybrid NOrec: a case study in the effectiveness of best effort hardware transactional memory , 2011, ASPLOS XVI.

[10]  Irina Calciu,et al.  Improved Single Global Lock Fallback for Best-effort Hardware Transactional Memory , 2014 .

[11]  Maged M. Michael,et al.  Quantitative comparison of Hardware Transactional Memory for Blue Gene/Q, zEnterprise EC12, Intel Core, and POWER8 , 2015, 2015 ACM/IEEE 42nd Annual International Symposium on Computer Architecture (ISCA).

[12]  Luís E. T. Rodrigues,et al.  Virtues and limitations of commodity hardware transactional memory , 2014, 2014 23rd International Conference on Parallel Architecture and Compilation (PACT).

[13]  Mark Moir,et al.  Adaptive integration of hardware and software lock elision techniques , 2014, SPAA.

[14]  Mark Moir,et al.  Pitfalls of lazy subscription , 2014 .

[15]  Mark Moir,et al.  Early experience with a commercial hardware transactional memory implementation , 2009, ASPLOS.

[16]  M. Frans Kaashoek,et al.  Scalable address spaces using RCU balanced trees , 2012, ASPLOS XVII.

[17]  Yehuda Afek,et al.  Reduced Hardware Lock Elision , 2014 .

[18]  Keir Fraser,et al.  Language support for lightweight transactions , 2003, SIGP.

[19]  Christopher J. Hughes,et al.  Performance evaluation of Intel® Transactional Synchronization Extensions for high-performance computing , 2013, 2013 SC - International Conference for High Performance Computing, Networking, Storage and Analysis (SC).

[20]  Jung Ho Ahn,et al.  ccTSA: A Coverage-Centric Threaded Sequence Assembler , 2012, PloS one.

[21]  Nuno Diegues,et al.  Self-Tuning Intel Transactional Synchronization Extensions , 2014, ICAC.

[22]  Nir Shavit,et al.  Reduced Hardware NOrec: A Safe and Scalable Hybrid Transactional Memory , 2015, ASPLOS.