Study of hardware transactional memory characteristics and serialization policies on Haswell

We evaluated the strengths and weaknesses of TSX, Intel's HTM extensions. We described features that are likely to yield performance gains when using TSX. We explored the performance of TSX with the aid of a new tool called htm-pBuilder. We introduced an efficient policy for guaranteeing forward progress on top of TSX. We explored various fallback policy tunings and transaction properties of TSX.

This paper presents an extensive performance study of the implementation of Hardware Transactional Memory (HTM) in the Haswell generation of Intel x86 core processors. It evaluates the strengths and weaknesses of this new architecture by exploring several dimensions in the space of Transactional Memory (TM) application characteristics using the Eigenbench (Hong et al., 2010) and CLOMP-TM (Schindewolf et al., 2012) benchmarks. This paper also introduces a new tool, htm-pBuilder, that tailors fallback policies and allows independent exploration of their parameters. This detailed performance study provides insights into the constraints imposed by Intel's Transactional Synchronization Extensions (TSX) and introduces a simple but efficient policy for guaranteeing forward progress on top of Intel's best-effort HTM, which was critical to achieving performance. The evaluation also shows that there are a number of potential improvements for designers of TM applications and software systems that use Intel's TM, and provides recommendations to extract maximum benefit from the current TM support available in Haswell.
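As a rough illustration of what a fallback policy on top of best-effort HTM can look like, the sketch below shows a generic retry-then-serialize pattern in C. It is not the policy evaluated in the paper, whose details the abstract does not give. All names here (`run_atomic`, `tx_attempt_fn`, `demo_attempt`, the `max_attempts` parameter) are hypothetical. The hardware transaction is abstracted behind a callback so the policy logic can be exercised without TSX hardware; on a real Haswell system the callback would presumably wrap the RTM intrinsics.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical hook: tries to run the critical section as one hardware
 * transaction and returns true if it committed. On TSX hardware this would
 * wrap the RTM begin/commit instructions and, inside the transaction, read
 * the fallback lock so that a fallback holder aborts it. */
typedef bool (*tx_attempt_fn)(void *arg);

/* Hypothetical hook: the same critical section, run non-speculatively. */
typedef void (*tx_body_fn)(void *arg);

typedef enum { PATH_HTM, PATH_FALLBACK } tx_path;

/* Single global fallback lock (a simple spinlock built on C11 atomics). */
static atomic_flag fallback_lock = ATOMIC_FLAG_INIT;

/* Try the transactional path up to max_attempts times; if every attempt
 * aborts, serialize behind the global lock, which always succeeds and so
 * guarantees forward progress. */
tx_path run_atomic(tx_attempt_fn attempt, tx_body_fn body, void *arg,
                   int max_attempts)
{
    assert(max_attempts >= 0);
    for (int i = 0; i < max_attempts; i++) {
        if (attempt(arg))
            return PATH_HTM;
    }
    while (atomic_flag_test_and_set_explicit(&fallback_lock,
                                             memory_order_acquire))
        ; /* spin until the lock is free */
    body(arg);
    atomic_flag_clear_explicit(&fallback_lock, memory_order_release);
    return PATH_FALLBACK;
}

/* Simulated workload: the "transaction" aborts a fixed number of times
 * before it can commit. Stands in for real hardware in this sketch. */
typedef struct { int aborts_left; int value; } demo_state;

static bool demo_attempt(void *arg)
{
    demo_state *s = arg;
    if (s->aborts_left > 0) { s->aborts_left--; return false; }
    s->value++;            /* the committed update */
    return true;
}

static void demo_body(void *arg)
{
    ((demo_state *)arg)->value++;   /* same update, under the lock */
}
```

In this sketch the retry budget is the only tunable. A fuller policy might also distinguish abort causes (for example, giving up immediately on capacity aborts) or back off between attempts, and those choices are left out here.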

[1]  Maurice Herlihy, et al. Invyswell: A hybrid transactional memory for Haswell's restricted transactional memory, 2014, 2014 23rd International Conference on Parallel Architecture and Compilation (PACT).

[2]  Mike Dai Wang. Exploring the Performance and Programmability Design Space of Hardware Transactional Memory, 2014.

[3]  David A. Wood, et al. LogTM-SE: Decoupling Hardware Transactional Memory from Caches, 2007, 2007 IEEE 13th International Symposium on High Performance Computer Architecture.

[4]  Dan Grossman, et al. ASF: AMD64 Extension for Lock-Free Data Structures and Transactional Memory, 2010, 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture.

[5]  Maged M. Michael, et al. Evaluation of Blue Gene/Q hardware support for transactional memories, 2012, 2012 21st International Conference on Parallel Architectures and Compilation Techniques (PACT).

[6]  Jonathan Richard Shewchuk, et al. Triangle: Engineering a 2D Quality Mesh Generator and Delaunay Triangulator, 1996, WACG.

[7]  Martin Schulz, et al. What scientific applications can benefit from hardware transactional memory?, 2012, 2012 International Conference for High Performance Computing, Networking, Storage and Analysis.

[8]  Kunle Olukotun, et al. Eigenbench: A simple exploration tool for orthogonal TM characteristics, 2010, IEEE International Symposium on Workload Characterization (IISWC'10).

[9]  Josep Torrellas, et al. Bulk Disambiguation of Speculative Threads in Multiprocessors, 2006, 33rd International Symposium on Computer Architecture (ISCA'06).

[10]  Christopher J. Hughes, et al. Performance evaluation of Intel® Transactional Synchronization Extensions for high-performance computing, 2013, 2013 SC - International Conference for High Performance Computing, Networking, Storage and Analysis (SC).

[11]  David A. Wood, et al. LogTM: log-based transactional memory, 2006, The Twelfth International Symposium on High-Performance Computer Architecture, 2006.

[12]  Kunle Olukotun, et al. STAMP: Stanford Transactional Applications for Multi-Processing, 2008, 2008 IEEE International Symposium on Workload Characterization.

[13]  Torvald Riegel, et al. Time-Based Software Transactional Memory, 2010, IEEE Transactions on Parallel and Distributed Systems.

[14]  Gokcen Kestor. RMS-TM: A Transactional Memory Benchmark for Recognition, Mining and Synthesis Applications, 2009.

[15]  Yehuda Afek, et al. Programming with hardware lock elision, 2013, PPoPP '13.

[16]  David A. Wood, et al. Performance Pathologies in Hardware Transactional Memory, 2007, IEEE Micro.

[17]  Manish Vachharajani, et al. An efficient software transactional memory using commit-time invalidation, 2010, CGO '10.

[18]  Nuno Diegues, et al. Self-Tuning Intel Transactional Synchronization Extensions, 2014, ICAC.

[19]  Kunle Olukotun, et al. Transactional memory coherence and consistency, 2004, Proceedings. 31st Annual International Symposium on Computer Architecture, 2004.

[20]  Maurice Herlihy, et al. Transactional Memory: Architectural Support For Lock-free Data Structures, 1993, Proceedings of the 20th Annual International Symposium on Computer Architecture.