FTSD: a fissionable lock for multicores

Delegation is a highly efficient approach to parallel synchronization. However, existing state-of-the-art delegation locks achieve good performance under moderate contention only at the cost of occupying computing cores, and they exhibit sub-optimal single-thread performance under no contention and non-scalable performance under high contention. In this paper, we present a fissionable lock, called FTSD, which consists of two underlying locks: a TTS (test-and-test-and-set) lock that supports lock stealing and serves as a fast path, and a NUMA-aware delegation lock that offers scalable performance under high contention. Our evaluation shows that FTSD performs as well as or better than other state-of-the-art locks.
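To make the fast-path idea concrete, the following is a minimal sketch of a generic test-and-test-and-set spinlock in C++. It is not the FTSD implementation: the class name `TTSLock`, the helper `run_counter_demo`, and the use of `std::this_thread::yield()` while waiting are illustrative assumptions, and the NUMA-aware delegation path that FTSD falls back to under contention is not shown.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Hypothetical sketch of a test-and-test-and-set (TTS) spinlock.
// Waiters spin on a plain load and only issue the atomic exchange
// when the lock appears free, which reduces cache-line traffic.
class TTSLock {
public:
    // One acquisition attempt: "test" with a load, then "test-and-set".
    // An uncontended caller succeeds here without queueing.
    bool try_lock() {
        return !locked_.load(std::memory_order_relaxed) &&
               !locked_.exchange(true, std::memory_order_acquire);
    }

    void lock() {
        while (!try_lock()) {
            // Wait on a read-only load until the lock looks free.
            while (locked_.load(std::memory_order_relaxed)) {
                std::this_thread::yield();  // assumption: be polite to the scheduler
            }
        }
    }

    void unlock() { locked_.store(false, std::memory_order_release); }

private:
    std::atomic<bool> locked_{false};
};

// Illustrative helper: several threads increment a shared counter
// under the lock and the final count is returned.
long run_counter_demo(int num_threads, int iters_per_thread) {
    TTSLock lock;
    long counter = 0;
    std::vector<std::thread> workers;
    for (int t = 0; t < num_threads; ++t) {
        workers.emplace_back([&] {
            for (int i = 0; i < iters_per_thread; ++i) {
                lock.lock();
                ++counter;
                lock.unlock();
            }
        });
    }
    for (auto& w : workers) w.join();
    return counter;
}
```

With a single thread, `try_lock()` succeeds on the first attempt, which is the property that makes a TTS lock attractive as a fast path. Under heavy contention every release triggers a burst of exchanges on the same cache line, which is the regime where the abstract relies on the delegation lock instead.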
