Concurrent Irrevocability in Best-Effort Hardware Transactional Memory

Existing best-effort, requester-wins implementations of hardware transactional memory (HTM) must fall back to non-speculative execution to guarantee forward progress for transactions that exceed hardware capacity, incur page faults, or suffer contention high enough to cause livelock. Current approaches to irrevocability use lock-based synchronization to enforce mutual exclusion while a transaction executes non-speculatively. This guarantees atomicity by conservatively precluding concurrency with every other transaction, at the cost of degraded performance. In this article, we propose a new form of concurrent irrevocability that aims to minimize the concurrency lost when transactions resort to irrevocability to complete. By keeping optimistic concurrency control enabled during the non-speculative execution of a transaction, our proposal allows more parallelism than existing schemes. We describe the instruction set extensions that expose concurrent irrevocable transactions, as well as the architectural extensions required to realize them on a best-effort HTM system without any modification to the cache coherence protocol. Our evaluation shows that the proposal reduces execution time by 12.5 percent on average across the STAMP benchmarks, and by 15.8 percent on average for highly contended workloads.
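
For readers unfamiliar with the lock-based baseline that the abstract contrasts against, the following is a minimal sketch of a single-global-lock fallback path. It is not the mechanism proposed in the article, and it is a software simulation only: Python has no HTM, so the `Abort` exception, the `run_transaction` helper, the `speculative` flag passed to the transaction body, and the retry budget `MAX_RETRIES` are all hypothetical names introduced for illustration.

```python
import threading

MAX_RETRIES = 3  # assumed retry budget before giving up on speculation


class Abort(Exception):
    """Hypothetical stand-in for a hardware abort (capacity, page fault, conflict)."""


# The single global lock that serializes irrevocable (non-speculative) execution.
fallback_lock = threading.Lock()


def run_transaction(body, max_retries=MAX_RETRIES):
    """Run body(speculative) and return (result, path_taken).

    The body receives True on a speculative attempt and False on the
    irrevocable fallback path.
    """
    for _ in range(max_retries):
        # Lock subscription, simplified: a speculative attempt must not overlap
        # an irrevocable transaction. Real hardware reads the lock inside the
        # transaction so a later acquisition aborts it; here we only check once
        # and count a held lock as a failed attempt.
        if fallback_lock.locked():
            continue
        try:
            return body(True), "speculative"
        except Abort:
            pass  # retry speculatively until the budget is exhausted
    # Fallback: mutual exclusion with every other transaction. This is the
    # loss of concurrency that the abstract describes.
    with fallback_lock:
        return body(False), "irrevocable"


def small(speculative):
    # Fits in the (simulated) hardware, so it commits speculatively.
    return 42


def oversized(speculative):
    # Simulates a transaction that always exceeds hardware capacity.
    if speculative:
        raise Abort("capacity exceeded")
    return 7


if __name__ == "__main__":
    print(run_transaction(small))      # (42, 'speculative')
    print(run_transaction(oversized))  # (7, 'irrevocable')
```

In this sketch, once `oversized` takes the lock, every other call to `run_transaction` either burns its retries or waits for the lock, even if it touches unrelated data. That serialization is the cost the abstract attributes to existing schemes.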
