Verification of the randomized consensus algorithm of Aspnes and Herlihy: a case study

The Probabilistic I/O Automaton model of [Seg95] is used as the basis for a formal presentation and proof of the randomized consensus algorithm of Aspnes and Herlihy. The algorithm guarantees termination within expected polynomial time. It is rather complex: processes move through a succession of asynchronous rounds, attempting to agree at each round, and each agreement attempt involves a distributed random walk. The algorithm is hard to analyze for three reasons: it relies on nontrivial results of probability theory (specifically, random walk theory); its setting is complex, combining asynchrony with both nondeterministic and probabilistic choice; and several different subprotocols interact. We formalize the Aspnes-Herlihy algorithm using probabilistic I/O automata. In doing so, we decompose it formally into three subprotocols: one to carry out the agreement attempts, one to conduct the random walks, and one to implement a shared counter needed by the random walks. Properties of all three subprotocols are proved separately and then combined using general results about automaton composition. It turns out that most of the work involves proving non-probabilistic properties (invariants, simulation mappings, non-probabilistic progress properties, etc.). The probabilistic reasoning is confined to a few small sections of the proof. Carrying out this proof has led us to develop several general proof techniques for probabilistic I/O automata. These include ways to combine expectations for different complexity measures, to compose expected complexity properties, to convert probabilistic claims to deterministic claims, to use abstraction mappings to prove probabilistic properties, and to apply random walk theory in a distributed computational setting. We apply all of these techniques to analyze the expected complexity of the algorithm.
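To give a rough feel for the random-walk component mentioned above, the following is a minimal sequential sketch, not the algorithm as formalized in the paper. It assumes a single integer standing in for the shared counter, absorbing barriers at +/- k*n, and a uniformly random scheduler in place of an adversary; the names `shared_coin_walk`, `mean_steps`, `n`, and `k` are hypothetical and chosen only for illustration. Asynchrony, stale reads of the counter, and the round structure are all ignored.

```python
import random


def shared_coin_walk(n, k, rng):
    """Simulate one random walk among n processes with barriers at +/- k*n.

    Returns (outcome, steps): outcome is 1 if the upper barrier is hit,
    0 if the lower barrier is hit; steps is the number of coin flips taken.
    """
    barrier = k * n          # assumed barrier placement, for illustration only
    counter = 0              # stands in for the shared counter
    steps = 0
    while abs(counter) < barrier:
        # Hypothetical scheduler: a uniformly random process moves next.
        # (The choice does not affect this simplified walk.)
        _pid = rng.randrange(n)
        # The chosen process flips a fair coin and moves the counter by one.
        counter += 1 if rng.random() < 0.5 else -1
        steps += 1
    return (1 if counter > 0 else 0), steps


def mean_steps(n, k, trials, seed=0):
    """Average number of flips until a barrier is hit, over repeated walks."""
    rng = random.Random(seed)
    total = sum(shared_coin_walk(n, k, rng)[1] for _ in range(trials))
    return total / trials


if __name__ == "__main__":
    print(shared_coin_walk(4, 2, random.Random(0)))
    print(mean_steps(2, 2, 2000))
```

For a symmetric walk started at 0, the expected number of steps to reach either barrier at distance B is B squared, so the average reported by `mean_steps` should be close to (k*n)^2. This quadratic dependence on the barrier is the kind of random-walk fact on which an expected-complexity analysis of this sort relies.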

[1]  Maurice Herlihy,et al.  Time-Lapse Snapshots , 1992, SIAM J. Comput..

[2]  Shay Kutten,et al.  Time Optimal Self-Stabilizing Spanning Tree Algorithms , 1993, FSTTCS.

[3]  F. Vaandrager Forward and Backward Simulations Part I : Untimed Systems , 1993 .

[4]  Amos Israeli,et al.  Analyzing Expected Time by Scheduler-Luck Games , 1995, IEEE Trans. Software Eng..

[5]  William Feller  An Introduction to Probability Theory and Its Applications , 1950 .

[6]  Karl R. Abrahamson On achieving consensus using a shared memory , 1988, PODC '88.

[7]  Nancy A. Lynch,et al.  Forward and backward simulations, part II: timing-based systems , 1993 .

[8]  Hans A. Hansson Time and probability in formal design of distributed systems , 1991, DoCS.

[9]  Roberto Segala,et al.  Modeling and verification of randomized distributed real-time systems , 1996 .

[10]  Yonatan Aumann,et al.  Efficient asynchronous consensus with the weak adversary scheduler , 1997, PODC '97.

[11]  Nancy A. Lynch,et al.  Proving time bounds for randomized distributed algorithms , 1994, PODC '94.

[12]  Nancy A. Lynch,et al.  Forward and Backward Simulations, II: Timing-Based Systems , 1991, Inf. Comput..

[13]  Annabelle McIver,et al.  Probabilistic predicate transformers , 1996, TOPL.

[14]  Daniel Lehmann,et al.  On the advantages of free choice: a symmetric and fully distributed solution to the dining philosophers problem , 1981, POPL '81.

[15]  Nancy A. Lynch,et al.  Impossibility of distributed consensus with one faulty process , 1985, JACM.

[16]  Tushar Deepak Chandra Polylog randomized wait-free consensus , 1996, PODC '96.

[17]  Cyrus Derman,et al.  Finite State Markovian Decision Processes , 1970 .

[18]  Roberto Segala,et al.  Formal verification of timed properties of randomized distributed algorithms , 1995, PODC '95.

[19]  Moshe Y. Vardi Automatic verification of probabilistic concurrent finite state programs , 1985, 26th Annual Symposium on Foundations of Computer Science (sfcs 1985).

[20]  Maurice Herlihy,et al.  Fast Randomized Consensus Using Shared Memory , 1990, J. Algorithms.

[21]  Nancy A. Lynch,et al.  Forward and Backward Simulations: I. Untimed Systems , 1995, Inf. Comput..

[22]  Andrea Bianco,et al.  Model Checking of Probabilistic and Nondeterministic Systems , 1995, FSTTCS.

[23]  J. Aspnes Time- and Space-Efficient Randomized Consensus , 1992 .

[24]  Micha Sharir,et al.  Termination of Probabilistic Concurrent Programs , 1983, TOPL.

[25]  James Aspnes,et al.  Randomized Consensus in Expected O(n log² n) Operations Per Processor , 1996, SIAM J. Comput..

[26]  Nancy A. Lynch,et al.  Hierarchical correctness proofs for distributed algorithms , 1987, PODC '87.

[27]  James Aspnes Time- and Space-Efficient Randomized Consensus , 1993, J. Algorithms.

[28]  Nancy A. Lynch,et al.  Liveness in Timed and Untimed Systems , 1994, Inf. Comput..

[29]  Leslie Lamport,et al.  Concurrent reading and writing , 1977, Commun. ACM.

[30]  Nancy A. Lynch,et al.  Distributed Algorithms , 1992, Lecture Notes in Computer Science.

[31]  Ophir Rachman,et al.  Randomized Consensus in Expected O(n²log n) Operations , 1991, WDAG.