60 Years of Mastering Concurrent Computing through Sequential Thinking

Modern computing systems are highly concurrent. Threads run concurrently in shared-memory multi-core systems, and programs run on different servers that communicate by sending messages to each other. Concurrent programming is hard because it requires coping with the many possible, unpredictable behaviors of the processes and of the communication media. The article argues that, right from the start in the 1960s, the main way of dealing with concurrency has been by reduction to sequential reasoning. It traces this history and illustrates it through several examples, from early ideas based on mutual exclusion (which was initially introduced to access shared physical resources), through consensus and concurrent objects (which are immaterial data), to today's distributed ledgers. It also discusses the limits that this approach encounters, related to fault-tolerance, performance, and inherently concurrent problems.
