Verifying Fault-Tolerant Distributed Systems Using Object-Based Graph Grammars

Assuring the correctness of fault-tolerant distributed systems can be an overwhelming task. Besides dealing with the complex problems inherent to distributed systems, the designer must ensure that, when components fail, the system either exhibits a well-defined failure behaviour or masks the failed components. Formal methods help in reasoning about such systems. In previous work we introduced a graphical formal specification language, called Object-Based Graph Grammars (OBGG), for modelling asynchronous distributed systems. We also defined a method for automatically inserting classical fault behaviours into OBGG models; the resulting models could be analysed using simulation. In this paper we propose a new method for automatically inserting fault behaviours into OBGG models that is suitable for analysis by verification. Moreover, we show how to formally verify OBGG models in the presence of such faults. A two-phase commit protocol is used to illustrate the contributions.
