TM²C: a software transactional memory for many-cores

Transactional memory is an appealing paradigm for concurrent systems. Many software implementations of the paradigm have been proposed over the past two decades, both for shared-memory multi-core systems and for clusters of distributed machines. Chip manufacturers have, however, started producing many-core architectures with low network-on-chip communication latencies and limited support for cache coherence, rendering existing transactional memory implementations inapplicable. This paper presents TM²C, the first software transactional memory protocol for many-core systems; its transactions are therefore both distributed and able to leverage shared memory. TM²C exploits fast messages over the network-on-chip to make accesses to shared data coherent. In particular, its visible read accesses allow conflicts to be detected eagerly, and it incorporates the first distributed contention manager that guarantees the commit of all transactions. We evaluate TM²C on Intel, AMD and Tilera architectures, ranging from common multi-cores to experimental many-cores. We build upon new message-passing protocols, based on both software and hardware, which are interesting in their own right. Our results on various benchmarks, including realistic banking and MapReduce applications, show that TM²C scales well regardless of the underlying platform.
