Accuracy of Message Counting Abstraction in Fault-Tolerant Distributed Algorithms

Fault-tolerant distributed algorithms are a vital part of mission-critical distributed systems. In principle, automatic verification can be used to ensure the absence of bugs in such algorithms. In practice, however, model checking tools will only establish the correctness of distributed algorithms if message passing is encoded efficiently. In this paper, we consider abstractions suitable for the many fault-tolerant distributed algorithms that count messages and compare the counts against thresholds, e.g., the size of a majority of processes. Our experience shows that storing only the numbers of sent and received messages in the global state is more efficient than explicitly modeling message buffers or sets of messages. This encoding is called the message-counting abstraction. Intuitively, this abstraction should maintain all necessary information. In this paper, we confirm this intuition for asynchronous systems by showing that the abstract system is bisimilar to the concrete system. Surprisingly, if there are real-time constraints on message delivery (as assumed in fault-tolerant clock synchronization algorithms), then there exists neither a timed bisimulation nor a time-abstracting bisimulation between the two systems. Still, we prove that the abstraction is useful for model checking: it preserves ATCTL properties, because the abstract and the concrete models simulate each other.
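To make the idea concrete, the following is a minimal sketch of the difference between storing message sets and storing only counters, for a single message type in a fault-free asynchronous setting. All names (`ConcreteState`, `AbstractState`, `abstract`, `majority_reached`) are hypothetical illustrations and are not taken from the paper's formal model, which also covers faulty processes and timing.

```python
from dataclasses import dataclass, field

@dataclass
class ConcreteState:
    # Explicit sets: which processes have sent, and from whom each process received.
    sent: set = field(default_factory=set)
    received: dict = field(default_factory=dict)  # receiver id -> set of sender ids

@dataclass
class AbstractState:
    # Message-counting abstraction: only the numbers are kept.
    nsent: int = 0
    nrcvd: dict = field(default_factory=dict)     # receiver id -> number received

def abstract(c: ConcreteState) -> AbstractState:
    """Map a concrete state to its counting abstraction (assumed mapping)."""
    return AbstractState(
        nsent=len(c.sent),
        nrcvd={p: len(senders) for p, senders in c.received.items()},
    )

def concrete_receive(c: ConcreteState, p: int, q: int) -> None:
    # Process p may only receive a message that q has actually sent.
    if q in c.sent:
        c.received.setdefault(p, set()).add(q)

def abstract_receive(a: AbstractState, p: int) -> None:
    # Process p may only receive while fewer messages were received than sent.
    if a.nrcvd.get(p, 0) < a.nsent:
        a.nrcvd[p] = a.nrcvd.get(p, 0) + 1

def majority_reached(count: int, n: int) -> bool:
    """Threshold guard: strictly more than half of n processes."""
    return 2 * count > n

# Run the same schedule on both models: processes 0, 1, 2 send; process 4 receives all three.
n = 5
c, a = ConcreteState(), AbstractState()
for q in (0, 1, 2):
    c.sent.add(q)      # each process sends at most once
    a.nsent += 1
for q in (0, 1, 2):
    concrete_receive(c, 4, q)
    abstract_receive(a, 4)

same_abstraction = abstract(c) == a
guard_concrete = majority_reached(len(c.received[4]), n)
guard_abstract = majority_reached(a.nrcvd[4], n)
```

In this sketch the threshold guard depends only on how many messages were received, so it evaluates identically on both states, while the abstract state is far smaller than the explicit sets.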
