Lattice Agreement in Message Passing Systems

This paper studies the lattice agreement problem and the generalized lattice agreement problem in distributed message passing systems. In the lattice agreement problem, given input values from a lattice, processes have to non-trivially decide output values that lie on a chain. We consider the lattice agreement problem in both synchronous and asynchronous systems. For synchronous lattice agreement, we present two algorithms that run in $\log f$ and $\min \{O(\log^2 h(L)), O(\log^2 f)\}$ rounds, respectively, where $h(L)$ denotes the height of the {\em input sublattice} $L$, $f < n$ is the number of crash failures the system can tolerate, and $n$ is the number of processes in the system. These algorithms have significantly better round complexity than previously known algorithms: the algorithm by Attiya et al. \cite{attiya1995atomic} takes $\log n$ synchronous rounds, and the algorithm by Mavronicolas \cite{mavronicolasabound} takes $\min \{O(h(L)), O(\sqrt{f})\}$ rounds. For asynchronous lattice agreement, we propose an algorithm with a time complexity of $2 \cdot \min \{h(L), f + 1\}$ message delays, which improves on the previously known time complexity of $O(n)$ message delays. The generalized lattice agreement problem, defined by Faleiro et al. in \cite{faleiro2012generalized}, is a generalization of the lattice agreement problem that applies to replicated state machines. We propose an algorithm that guarantees liveness in asynchronous systems when a majority of the processes are correct. Our algorithm requires $\min \{O(h(L)), O(f)\}$ units of time in the worst case, which is better than the $O(n)$ units of time required by the algorithm of Faleiro et al. \cite{faleiro2012generalized}.
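To make the problem statement concrete, the following is a minimal sketch of a checker for the usual lattice agreement conditions: each decided output is at least the process's own input, at most the join of all inputs, and any two outputs are comparable (so the outputs lie on a chain). It is an illustration only, not one of the paper's algorithms. The choice of the power-set lattice (finite sets ordered by inclusion, join = union) and the names `join` and `satisfies_lattice_agreement` are assumptions made for this example.

```python
from functools import reduce
from itertools import combinations


def join(a, b):
    """Join in the (assumed) power-set lattice: set union."""
    return a | b


def satisfies_lattice_agreement(inputs, outputs):
    """Check the lattice agreement conditions on one execution.

    inputs:  dict mapping process id -> proposed value (frozenset)
    outputs: dict mapping process id -> decided value (frozenset);
             crashed processes may simply be absent from this dict.
    """
    top = reduce(join, inputs.values(), frozenset())
    for pid, decided in outputs.items():
        # Downward validity: a process decides at least its own input.
        if not inputs[pid] <= decided:
            return False
        # Upward validity: no output exceeds the join of all inputs.
        if not decided <= top:
            return False
    # Comparability: every pair of outputs is ordered, i.e. they form a chain.
    return all(a <= b or b <= a for a, b in combinations(outputs.values(), 2))


# Hypothetical execution with three processes.
inputs = {1: frozenset({"a"}), 2: frozenset({"b"}), 3: frozenset({"c"})}

# Outputs on a chain: {a} <= {a, b} <= {a, b, c}.
chain_outputs = {
    1: frozenset({"a"}),
    2: frozenset({"a", "b"}),
    3: frozenset({"a", "b", "c"}),
}

# {a} and {b} are incomparable, so comparability fails.
incomparable_outputs = {
    1: frozenset({"a"}),
    2: frozenset({"b"}),
    3: frozenset({"a", "b", "c"}),
}
```

Here `chain_outputs` passes the check while `incomparable_outputs` does not, which is the difference between a valid and an invalid set of decisions. Note that, unlike consensus, the processes are not required to decide the same value.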

[1]  Leslie Lamport, et al. The part-time parliament, 1998, TOCS.

[2]  M. Mavronicolas. A Bound on the Rounds to Reach Lattice Agreement, 2005.

[3]  Gadi Taubenfeld. Synchronization Algorithms and Concurrent Programming, 2006.

[4]  Marc Shapiro, et al. Conflict-Free Replicated Data Types, 2011, SSS.

[5]  Fred B. Schneider, et al. Implementing fault-tolerant services using the state machine approach: a tutorial, 1990, CSUR.

[6]  Maurice Herlihy, et al. Linearizability: a correctness condition for concurrent objects, 1990, TOPL.

[7]  Hagit Attiya, et al. Atomic snapshots in O(n log n) operations, 1993, PODC '93.

[8]  Hagit Attiya, et al. Distributed Computing: Fundamentals, Simulations and Advanced Topics, 1998.

[9]  Nancy A. Lynch, et al. Impossibility of distributed consensus with one faulty process, 1985, JACM.

[10]  Ophir Rachman, et al. Atomic snapshots using lattice agreement, 1995, Distributed Computing.

[11]  Danny Dolev, et al. Authenticated Algorithms for Byzantine Agreement, 1983, SIAM J. Comput.

[12]  Andrew S. Tanenbaum, et al. Distributed systems: Principles and Paradigms, 2001.

[13]  Leslie Lamport, et al. Paxos Made Simple, 2001.

[14]  Lukas Furst. Concurrent Programming Algorithms Principles And Foundations, 2016.

[15]  Carole Delporte-Gallet, et al. Implementing Snapshot Objects on Top of Crash-Prone Asynchronous Message-Passing Systems, 2018, IEEE Transactions on Parallel and Distributed Systems.

[16]  Marc Shapiro, et al. Convergent and Commutative Replicated Data Types, 2011, Bull. EATCS.

[17]  Sriram K. Rajamani, et al. Generalized lattice agreement, 2012, PODC '12.

[18]  Nir Shavit, et al. Atomic snapshots of shared memory, 1990, JACM.