Model Checking of Consensus Algorithms

We show for the first time that standard model checking can completely verify asynchronous algorithms for solving consensus, a fundamental problem in fault-tolerant distributed computing. Model checking is a powerful verification methodology based on state exploration. However, it has rarely been applied to consensus algorithms, because these algorithms induce huge, often infinite, state spaces. Here we focus on consensus algorithms based on the Heard-Of model, a new computation model for distributed computing. By exploiting the high abstraction level of this computation model and devising a finite representation of unbounded timestamps, we develop a methodology for verifying consensus algorithms in every possible state by model checking.
