Formal Verification of Consensus Algorithms Tolerating Malicious Faults

Consensus is the paradigmatic problem in fault-tolerant distributed computing: it requires network nodes that communicate by message passing to agree on a common value even in the presence of (benign or malicious) faults. Several algorithms for solving Consensus exist, but few have been rigorously verified, and fewer still formally. The Heard-Of model proposes a simple, unifying framework for defining distributed algorithms in the presence of communication faults. Algorithms proceed in communication-closed rounds, and assumptions on the faults tolerated by an algorithm are stated abstractly in the form of communication predicates. Extending previous work on the case of benign faults, our approach relies on the fact that properties such as Consensus can be verified over a coarse-grained, round-based representation of executions. We have encoded the Heard-Of model in the interactive proof assistant Isabelle/HOL and have used this encoding to formally verify three Consensus algorithms based on synchronous and asynchronous assumptions. Our proofs give some new insights into the correctness of the algorithms, in particular with respect to transient faults.
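To make the round-based view concrete, the following is a minimal Python sketch of a coarse-grained execution in the style of the Heard-Of model. It is an illustration under stated assumptions, not the Isabelle/HOL encoding and not one of the three verified algorithms: it covers only the benign case (messages may be lost but are never corrupted), the update rule follows the well-known OneThirdRule algorithm, and all names (`run_round`, `one_third_rule_update`, `is_uniform_round`) are hypothetical.

```python
from collections import Counter

def run_round(states, ho_sets, send, update):
    """One communication-closed round: every process sends, then each
    process updates using only messages from its heard-of set."""
    messages = {p: send(s) for p, s in states.items()}
    return {
        p: update(states[p], {q: messages[q] for q in ho_sets[p]})
        for p in states
    }

def one_third_rule_update(n):
    """Update rule in the style of OneThirdRule for n processes.
    A state is a pair (current value x, decision or None)."""
    def update(state, received):
        x, decision = state
        if 3 * len(received) > 2 * n:           # heard of more than 2n/3 processes
            counts = Counter(received.values())
            top = max(counts.values())
            # adopt the smallest among the most frequently received values
            x = min(v for v, c in counts.items() if c == top)
            if 3 * top > 2 * n and decision is None:
                decision = x                    # more than 2n/3 identical values
        return (x, decision)
    return update

def is_uniform_round(ho_sets, n):
    """Example communication predicate (an assumption of this sketch):
    all processes hear of the same set of more than 2n/3 processes."""
    sets = list(ho_sets.values())
    return all(s == sets[0] for s in sets) and 3 * len(sets[0]) > 2 * n

# Hypothetical run: 4 processes, no message loss in either round.
n = 4
everyone = {0, 1, 2, 3}
ho = {p: everyone for p in range(n)}
states = {0: (3, None), 1: (1, None), 2: (1, None), 3: (2, None)}
send = lambda s: s[0]                           # each process sends its current value
after_round_1 = run_round(states, ho, send, one_third_rule_update(n))
after_round_2 = run_round(after_round_1, ho, send, one_third_rule_update(n))
```

In this run the first round aligns every process on the value 1 without any decision, and the second round lets every process decide 1. An execution is thus fully described by the sequence of heard-of sets, which is the kind of coarse-grained representation the abstract refers to; modelling malicious faults would additionally require letting received messages differ from the ones sent.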
