Efficient model checking of fault-tolerant distributed protocols

To aid the formal verification of fault-tolerant distributed protocols, we propose an approach that significantly reduces the cost of model checking them. These protocols often specify atomic, process-local events that consume a set of messages, change the state of a process, and send zero or more messages. We call such events quorum transitions and leverage them to optimize state exploration in two ways. First, we generate fewer states than models in which quorum transitions are expressed by single-message transitions. Second, we refine quorum transitions into sets of equivalent, finer-grained transitions that allow partial-order reduction algorithms to achieve better reduction. We implement the MP-Basset model checker, which supports refined quorum transitions. We model check protocols representing core primitives of deployed reliable distributed systems, namely Paxos consensus, regular storage, and Byzantine-tolerant multicast. We achieve up to 92% memory reduction and 85% time reduction compared to model checking with standard, unrefined single-message transitions.
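
To illustrate the first optimization, the following is a minimal sketch of why a quorum transition can yield fewer states than a single-message encoding. It is a toy model and not the MP-Basset implementation: one receiver waits for a quorum of messages out of `n` pending ones, and all names (`explore`, `single_message_successors`, `quorum_successors`, `count_states`) are hypothetical. In the single-message model the receiver buffers one message per step, so every partially filled buffer is a distinct state; in the quorum model the receiver consumes a whole quorum atomically.

```python
from itertools import combinations


def explore(initial, successors):
    """Exhaustively explore the states reachable from `initial`."""
    seen = {initial}
    frontier = [initial]
    while frontier:
        state = frontier.pop()
        for nxt in successors(state):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return seen


def single_message_successors(quorum):
    """Toy single-message model: state = (pending, buffered, decided)."""
    def successors(state):
        pending, buffered, decided = state
        if decided:
            return
        # Consume exactly one pending message per transition.
        for msg in pending:
            new_buffered = buffered | {msg}
            yield (pending - {msg}, new_buffered, len(new_buffered) >= quorum)
    return successors


def quorum_successors(quorum):
    """Toy quorum model: state = (pending, decided)."""
    def successors(state):
        pending, decided = state
        if decided:
            return
        # Consume a whole quorum of messages in one atomic transition.
        for chosen in combinations(sorted(pending), quorum):
            yield (pending - frozenset(chosen), True)
    return successors


def count_states(n, quorum):
    """Return (#states in single-message model, #states in quorum model)."""
    msgs = frozenset(range(n))
    single = explore((msgs, frozenset(), False), single_message_successors(quorum))
    atomic = explore((msgs, False), quorum_successors(quorum))
    return len(single), len(atomic)
```

In this toy setting, `count_states(4, 3)` explores 15 states with single-message transitions and 5 with the quorum transition, because the intermediate buffer contents never become part of the state. The gap in a real protocol model depends on the protocol and on the reductions applied, so these counts say nothing about the figures reported above.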
