Dependability assessment of GUARDS instances

The generic architectural concepts developed in the European ESPRIT project GUARDS (Generic Upgradable Architecture for Real-time Dependable Systems) provide a comprehensive framework from which specific instances can be derived to meet the dependability requirements of various application domains. Three main application domains are considered (railway, nuclear propulsion, and space), corresponding to the fields of the project's three end-user partners. This paper presents the modeling method that supports the assessment of GUARDS instances. The goal is to help designers make objective decisions when defining a specific instance of the generic architecture. After a short summary of the main architectural concepts of GUARDS, the paper describes the major assumptions concerning: i) component types (both hardware and software), ii) fault types, with special attention paid to potentially correlated faults, and iii) the generic fault tolerance features of GUARDS. The main architectural characteristics of the target instances (one for each application domain) are briefly described. The modeling strategy is summarized, and examples of models (stochastic Petri nets) are given. Selected results are then presented and discussed. They illustrate the usefulness of the modeling and evaluation method, in particular through sensitivity analyses with respect to model parameters.
