A Logic-Based Framework for Verifying Consensus Algorithms

Fault-tolerant distributed algorithms play an important role in ensuring the reliability of many software applications. In this paper we consider distributed algorithms whose computations are organized in rounds. To verify the correctness of such algorithms, we reason about (i) properties (such as invariants) of the state, (ii) the transitions controlled by the algorithm, and (iii) the communication graph. We introduce a logic that addresses these points and contains set comprehensions with cardinality constraints, function symbols to describe the local states of each process, and a limited form of quantifier alternation to express the verification conditions. We show its use in automating the verification of consensus algorithms. In particular, we give a semi-decision procedure for the unsatisfiability problem of the logic and identify a decidable fragment. We successfully applied our framework to verify the correctness of a variety of consensus algorithms tolerant to both benign faults (message loss, process crashes) and value faults (message corruption).
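
To give a flavor of the kind of formula such a logic is meant to express, the following is an illustrative sketch only and is not taken from the paper. All symbols are assumptions chosen for the example: P is the set of processes, n = |P|, x is a function symbol giving each process's local value, HO(p) is the set of processes from which p receives a message in the current round, and the 2n/3 threshold is a typical quorum bound rather than one stated in the abstract.

```latex
% Hypothetical invariant: some value v is held by more than two thirds
% of the processes (a set comprehension with a cardinality constraint).
\exists v.\; \bigl|\{\, p \in P \mid x(p) = v \,\}\bigr| > \tfrac{2n}{3}

% Hypothetical communication assumption: every process hears from more
% than two thirds of the processes in the round.
\forall p \in P.\; \bigl|\mathit{HO}(p)\bigr| > \tfrac{2n}{3}

% Elementary counting fact that links the two: any two sets A, B \subseteq P
% with more than 2n/3 elements each overlap in more than n/3 elements.
|A| > \tfrac{2n}{3} \;\wedge\; |B| > \tfrac{2n}{3}
  \;\Longrightarrow\; |A \cap B| \;\ge\; |A| + |B| - n \;>\; \tfrac{n}{3}
```

The first formula uses a set comprehension under a cardinality bound, the second quantifies over processes and uses a function symbol for per-process state, and the third shows the style of cardinality reasoning a verification condition of this shape would rely on.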
