Role-Based Symmetry Reduction of Fault-Tolerant Distributed Protocols with Language Support

Fault-tolerant (FT) distributed protocols, such as group membership and consensus, are fundamental building blocks for many practical systems, e.g., the Google File System. Rigor is desirable not only in the design of these protocols but especially in their verification, given the complexity and fallibility of manual proofs. Model checking (MC) is attractive for protocol verification because it is fully automated and offers a rich property language. However, as an exhaustive exploration method, its scalability is constrained by the total number of distinct system states. We observe that, although FT distributed protocols usually display a very high degree of symmetry, which stems from permuting different processes, MC efforts targeting their automated verification often disregard this symmetry. We therefore propose to leverage the framework of symmetry reduction and to improve on its existing applications by specifying so-called role-based symmetries. Our secondary contribution is a high-level description language called FTDP, which eases the symmetry-aware specification of FT distributed protocols. FTDP supports synchronous as well as asynchronous protocols, a variety of fault types, and the specification of safety and liveness properties. Specifications written in FTDP can be analyzed directly by tools that support symmetry reduction. We demonstrate the benefit of our approach on two well-known and complex FT distributed protocols, Paxos and the Byzantine Generals.
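The general idea behind symmetry reduction with roles can be illustrated with a minimal Python sketch. It is not FTDP and not the method of this work: the toy protocol, the `ROLES` assignment, and the helper names `successors`, `canonical`, and `explore` are all assumptions made for illustration. Processes that share a role are treated as interchangeable, so each explored state is replaced by one representative in which the local states within a role are sorted.

```python
from collections import deque

# Hypothetical toy model (an assumption, not taken from the paper): every
# process has a role and a local state that advances 0 -> 1 -> 2, and any
# process may take one step at a time.
ROLES = ("proposer", "acceptor", "acceptor", "acceptor")


def successors(state):
    """Yield every state reachable by advancing one process by one step."""
    for i, local in enumerate(state):
        if local < 2:
            yield state[:i] + (local + 1,) + state[i + 1:]


def canonical(state):
    """Return one representative per orbit: sort local states within each role."""
    result = list(state)
    for role in set(ROLES):
        idx = [i for i, r in enumerate(ROLES) if r == role]
        for i, value in zip(idx, sorted(state[i] for i in idx)):
            result[i] = value
    return tuple(result)


def explore(initial, reduce_symmetry):
    """Breadth-first exploration; returns the number of stored states."""
    norm = canonical if reduce_symmetry else (lambda s: s)
    start = norm(initial)
    seen = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for nxt in successors(current):
            nxt = norm(nxt)
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return len(seen)


full = explore((0, 0, 0, 0), reduce_symmetry=False)
reduced = explore((0, 0, 0, 0), reduce_symmetry=True)
print(full, reduced)
```

In this toy model the unreduced search stores 81 states (3 local states for each of 4 processes), while the reduced search stores 30 (3 proposer states times 10 multisets of acceptor states). Sorting is only a valid canonicalization here because the three acceptors really are interchangeable in the toy transition relation; a real protocol would need that property to be established first.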
