Transactional memory: models and algorithms

Modern multicore architectures enable the concurrent execution of an unprecedented number of threads. This creates both an opportunity for extreme performance and a complex synchronization challenge. Conventional lock-based synchronization has several drawbacks that limit the parallelism offered by multicore architectures: coarse-grained locks do not scale, and fine-grained locks are difficult to program correctly because locks are generally not composable. Transactional memory (TM) [45, 81] provides an alternative synchronization mechanism that is non-blocking, composable, and easier to program with than locks [64]. TM-based synchronization has recently been included in IBM’s Blue Gene/Q [39, 84] and Intel’s Haswell processors [20]. TM is predicted to be widely used in future processors, possibly even GPUs [32, 86]. In the research community, several TM implementations (hardware, software, and hybrid) have been proposed and studied, e.g., [16, 24, 26, 30, 31, 43, 44, 60]. The TM book by Harris et al. [40] provides an excellent overview of the design and implementation of TM systems up to early spring 2010.
TM operates in a way similar to database transactions: it aggregates a sequence of shared-resource accesses (reads or writes) that should be executed atomically by a single thread into a fundamental unit called a transaction. A transaction is thus a piece of code that executes a series of reads and writes to shared memory. These reads and writes logically occur at a single instant in time; intermediate states are not visible to other (successful) transactions. TM increases parallelism because threads need not wait for access to a shared resource, and different threads can simultaneously modify disjoint parts of a data structure that would normally be protected by the same lock. A transaction ends either by committing, in which case all of its updates take effect, or by aborting, in which case none of them does.
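The commit-or-abort behaviour described above can be illustrated with a minimal software sketch. It is not any of the cited TM systems: the names `TVar`, `Transaction`, `Conflict`, `atomically`, and `demo` are hypothetical, and for simplicity the sketch serializes the short commit step with an ordinary lock, so unlike the TM designs discussed here it is not non-blocking. Writes are buffered, the versions of all values read are re-checked at commit time, and a transaction that fails this check aborts with no visible effect and is retried.

```python
import threading


class TVar:
    """A shared memory cell paired with a version counter (hypothetical name)."""

    def __init__(self, value):
        self.value = value
        self.version = 0


class Conflict(Exception):
    """Signals that the current transaction must abort."""


# Simplifying assumption: one global lock protects snapshots and commits.
_commit_lock = threading.Lock()


class Transaction:
    def __init__(self):
        self.reads = {}   # TVar -> version seen at first read
        self.writes = {}  # TVar -> buffered value, invisible until commit

    def read(self, tvar):
        if tvar in self.writes:          # read our own pending write
            return self.writes[tvar]
        with _commit_lock:               # consistent (value, version) pair
            value, version = tvar.value, tvar.version
        if self.reads.setdefault(tvar, version) != version:
            raise Conflict()             # cell changed since our first read
        return value

    def write(self, tvar, value):
        self.writes[tvar] = value        # buffered: nothing is shared yet

    def commit(self):
        with _commit_lock:
            # Abort if any cell we read was updated by another commit.
            for tvar, version in self.reads.items():
                if tvar.version != version:
                    raise Conflict()
            # Otherwise all updates take effect together.
            for tvar, value in self.writes.items():
                tvar.value = value
                tvar.version += 1


def atomically(body):
    """Run body(tx) as a transaction, retrying after every abort."""
    while True:
        tx = Transaction()
        try:
            result = body(tx)
            tx.commit()
            return result
        except Conflict:
            continue  # aborted: buffered writes are simply discarded


def demo(workers=4, transfers=50):
    """Move one unit per transaction from cell a to cell b, concurrently."""
    a, b = TVar(1000), TVar(0)

    def transfer(tx):
        tx.write(a, tx.read(a) - 1)
        tx.write(b, tx.read(b) + 1)

    def work():
        for _ in range(transfers):
            atomically(transfer)

    threads = [threading.Thread(target=work) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return a.value, b.value
```

Calling `demo()` runs concurrent transfers between two cells. Because every committed transfer is validated against the versions it read, the sum of the two cells is preserved no matter how the threads interleave. A transaction that is going to abort may still observe values from different commits before it fails validation, which is consistent with the statement above that intermediate states are hidden from successful transactions.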
Each program thread generates a sequence of transactions. Transactions of the same thread execute sequentially by following the program execution flow. However, transactions of different threads may conflict when they attempt to access the same shared memory resources. The advantage of TM is that if there are no conflicts

[1]  Chen Ding,et al.  A Key-based Adaptive Transactional Memory Executor , 2007, 2007 IEEE International Parallel and Distributed Processing Symposium.

[2]  Rachid Guerraoui,et al.  Polymorphic Contention Management , 2005, DISC.

[3]  Donald E. Porter,et al.  MetaTM/TxLinux: Transactional Memory for an Operating System , 2008, IEEE Micro.

[4]  Binoy Ravindran,et al.  Brief Announcement: Relay: A Cache-Coherence Protocol for Distributed Transactional Memory , 2009, OPODIS.

[5]  Andrew Brownsword,et al.  Hardware transactional memory for GPU architectures , 2011, 2011 44th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).

[6]  João P. Cachopo,et al.  Versioned boxes as the basis for memory transactions , 2006, Sci. Comput. Program..

[7]  F. Leighton,et al.  Introduction to Parallel Algorithms and Architectures: Arrays, Trees, Hypercubes , 1991 .

[8]  Maurice Herlihy,et al.  The Arrow Distributed Directory Protocol , 1998, DISC.

[9]  Gokarna Sharma,et al.  Window-based greedy contention management for transactional memory: theory and practice , 2012, Distributed Computing.

[10]  Maurice Herlihy,et al.  Robust Contention Management in Software Transactional Memory , 2005, OOPSLA 2005.

[11]  Gokarna Sharma,et al.  On the Performance of Window-Based Contention Managers for Transactional Memory , 2011, 2011 IEEE International Symposium on Parallel and Distributed Processing Workshops and Phd Forum.

[12]  Kunle Olukotun,et al.  STAMP: Stanford Transactional Applications for Multi-Processing , 2008, 2008 IEEE International Symposium on Workload Characterization.

[13]  Kai Lu,et al.  Brief announcement: NUMA-aware transactional memory , 2010, PODC '10.

[14]  Mohamed M. Saad.  Supporting STM in Distributed Systems: Mechanisms and a Java Framework , 2011 .

[15]  Bruce M. Maggs,et al.  Exploiting locality for data management in systems of limited bandwidth , 1997, Proceedings 38th Annual Symposium on Foundations of Computer Science.

[16]  Roger Wattenhofer,et al.  Bounds on Contention Management Algorithms , 2009, ISAAC.

[17]  Ronald G. Dreslinski,et al.  Bloom Filter Guided Transaction Scheduling , 2011, 2011 IEEE 17th International Symposium on High Performance Computer Architecture.

[18]  Ronald G. Dreslinski,et al.  Proactive transaction scheduling for contention management , 2009, 2009 42nd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).

[19]  Michael F. Spear,et al.  Conflict Detection and Validation Strategies for Software Transactional Memory , 2006, DISC.

[20]  Hagit Attiya.  The inherent complexity of transactional memory and what to do about it , 2010, PODC '10.

[21]  Rachid Guerraoui,et al.  Preventing versus curing: avoiding conflicts in transactional memories , 2009, PODC '09.

[22]  David B. Shmoys,et al.  Improved Lower Bounds for the Universal and a priori TSP , 2010, APPROX-RANDOM.

[23]  William N. Scherer,et al.  Advanced contention management for dynamic software transactional memory , 2005, PODC '05.

[24]  David Hasenfratz,et al.  Transactional Memory: How to perform load adaption in a simple and distributed manner , 2010, 2010 International Conference on High Performance Computing & Simulation.

[25]  Nir Shavit,et al.  NUMA-aware reader-writer locks , 2013, PPoPP '13.

[26]  Kerry Raymond,et al.  A tree-based algorithm for distributed mutual exclusion , 1989, TOCS.

[27]  Torvald Riegel,et al.  Time-Based Software Transactional Memory , 2010, IEEE Transactions on Parallel and Distributed Systems.

[28]  Maged M. Michael,et al.  Evaluation of Blue Gene/Q hardware support for transactional memories , 2012, 2012 21st International Conference on Parallel Architectures and Compilation Techniques (PACT).

[29]  Amith R. Mamidala,et al.  Looking under the hood of the IBM Blue Gene/Q network , 2012, 2012 International Conference for High Performance Computing, Networking, Storage and Analysis.

[30]  Guevara Noubir,et al.  Universal approximations for TSP, Steiner tree, and set cover , 2005, STOC '05.

[31]  David A. Wood,et al.  TokenTM: Efficient Execution of Large Transactions with Hardware Transactional Memory , 2008, 2008 International Symposium on Computer Architecture.

[32]  James R. Larus,et al.  Transactional Memory, 2nd edition , 2010, Transactional Memory.

[33]  Ye Sun,et al.  Distributed transactional memory for metric-space networks , 2005, Distributed Computing.

[34]  Michael F. Spear,et al.  NOrec: streamlining STM by abolishing ownership records , 2010, PPoPP '10.

[35]  Luís E. T. Rodrigues,et al.  D2STM: Dependable Distributed Software Transactional Memory , 2009, 2009 15th IEEE Pacific Rim International Symposium on Dependable Computing.

[36]  Bradford L. Chamberlain,et al.  Software transactional memory for large scale clusters , 2008, PPoPP.

[37]  Mohammad Taghi Hajiaghayi,et al.  Improved lower and upper bounds for universal TSP in planar metrics , 2006, SODA '06.

[38]  William N. Scherer,et al.  Contention Management in Dynamic Software Transactional Memory , 2004 .

[39]  Mikel Luján,et al.  DiSTM: A Software Transactional Memory Framework for Clusters , 2008, 2008 37th International Conference on Parallel Processing.

[40]  Subhash Khot,et al.  Improved inapproximability results for MaxClique, chromatic number and approximate graph coloring , 2001, Proceedings 2001 IEEE International Conference on Cluster Computing.

[41]  Gokarna Sharma,et al.  Window-Based Greedy Contention Management for Transactional Memory , 2010, DISC.

[42]  Binoy Ravindran,et al.  Scheduling Transactions in Replicated Distributed Software Transactional Memory , 2013, 2013 13th IEEE/ACM International Symposium on Cluster, Cloud, and Grid Computing.

[43]  David Eisenstat,et al.  Lowering the Overhead of Nonblocking Software Transactional Memory , 2006 .

[44]  Alan M. Frieze,et al.  Random graphs , 2006, SODA '06.

[45]  Rachid Guerraoui,et al.  Toward a theory of transactional contention managers , 2005, PODC '05.

[46]  Alexandra Fedorova,et al.  A case for NUMA-aware contention management on multicore systems , 2010, 2010 19th International Conference on Parallel Architectures and Compilation Techniques (PACT).

[47]  Mikel Luján,et al.  Steal-on-Abort: Improving Transactional Memory Performance through Dynamic Transaction Reordering , 2008, HiPEAC.

[48]  Hagit Attiya,et al.  Transactional Contention Management as a Non-Clairvoyant Scheduling Problem , 2010, Algorithmica.

[49]  Hagit Attiya,et al.  A Provably Starvation-Free Distributed Directory Protocol , 2010, SSS.

[50]  Maurice Herlihy,et al.  Transactional Memory: Architectural Support For Lock-free Data Structures , 1993, Proceedings of the 20th Annual International Symposium on Computer Architecture.

[51]  Binoy Ravindran,et al.  Snake: Control Flow Distributed Software Transactional Memory , 2011, SSS.

[52]  Nir Shavit,et al.  Software transactional memory , 1995, PODC '95.

[53]  Noga Alon,et al.  Lower bounds on the competitive ratio for mobile user tracking and distributed job scheduling , 1992, Proceedings., 33rd Annual Symposium on Foundations of Computer Science.

[54]  Michael F. Spear,et al.  A comprehensive strategy for contention management in software transactional memory , 2009, PPoPP '09.

[55]  Gokarna Sharma,et al.  Towards Load Balanced Distributed Transactional Memory , 2012, Euro-Par.

[56]  Danny Hendler,et al.  CAR-STM: scheduling-based collision avoidance and resolution for software transactional memory , 2008, PODC '08.

[57]  Madalin Mihailescu,et al.  Exploiting distributed version concurrency in a transactional memory cluster , 2006, PPoPP '06.

[58]  Maurice Herlihy,et al.  Competitive concurrent distributed queuing , 2001, PODC '01.

[59]  Emmett Witchel,et al.  Is transactional programming actually easier? , 2010, PPoPP '10.

[60]  Gokarna Sharma,et al.  An Analysis Framework for Distributed Hierarchical Directories , 2013, ICDCN.

[61]  Hsien-Hsin S. Lee,et al.  Adaptive transaction scheduling for transactional memory systems , 2008, SPAA '08.

[62]  Hagit Attiya,et al.  Transactional scheduling for read-dominated workloads , 2009, J. Parallel Distributed Comput..

[63]  Mikel Luján,et al.  On the Performance of Contention Managers for Complex Transactional Memory Benchmarks , 2009, 2009 Eighth International Symposium on Parallel and Distributed Computing.

[64]  Costas Busch,et al.  Optimal Oblivious Path Selection on the Mesh , 2008, IEEE Transactions on Computers.

[65]  Maurice Herlihy,et al.  Software transactional memory for dynamic-sized data structures , 2003, PODC '03.

[66]  Philip Heidelberger,et al.  Blue Gene/L torus interconnection network , 2005, IBM J. Res. Dev..

[67]  Håkan Grahn,et al.  Transactional memory , 2010, J. Parallel Distributed Comput..

[68]  Danny Hendler,et al.  Exploiting Locality in Lease-Based Replicated Transactional Memory via Task Migration , 2013, DISC.

[69]  Hagit Attiya,et al.  RelSTM: A Proactive Transactional Memory Scheduler , 2013 .

[70]  Danny Hendler,et al.  Scheduling support for transactional memory contention management , 2010, PPoPP '10.

[71]  Michael Gschwind,et al.  The IBM Blue Gene/Q Compute Chip , 2012, IEEE Micro.

[72]  Torvald Riegel,et al.  Dynamic performance tuning of word-based software transactional memory , 2008, PPoPP.

[73]  Nir Shavit,et al.  Transactional Locking II , 2006, DISC.

[74]  Maurice Herlihy,et al.  Dynamic Analysis of the Arrow Distributed Protocol , 2006, Theory of Computing Systems.

[75]  Kai Lu,et al.  Investigating transactional memory performance on ccNUMA machines , 2009, HPDC '09.

[76]  Luís E. T. Rodrigues,et al.  Cloud-TM: harnessing the cloud with distributed transactional memories , 2010, OPSR.

[77]  Binoy Ravindran,et al.  On Transactional Scheduling in Distributed Transactional Memory Systems , 2010, SSS.

[78]  Gokarna Sharma,et al.  Distributed transactional memory for general networks , 2012, 2012 IEEE 26th International Parallel and Distributed Processing Symposium.

[79]  Binoy Ravindran,et al.  Dynamic analysis of the relay cache-coherence protocol for distributed transactional memory , 2010, 2010 IEEE International Symposium on Parallel & Distributed Processing (IPDPS).

[80]  Depei Qian,et al.  Software Transactional Memory for GPU Architectures , 2014, IEEE Computer Architecture Letters.

[81]  Mohammad Taghi Hajiaghayi,et al.  Oblivious network design , 2006, SODA '06.