Assessment of the Reliability of Fault-Tolerant Software: A Bayesian Approach

Fault-tolerant systems based on software design diversity may be able to achieve high levels of reliability more cost-effectively than other approaches, such as heroic debugging. Earlier experiments have shown multi-version software systems to be more reliable than the individual versions. However, it is also clear that the reliability benefits are much smaller than would be suggested by naive assumptions of failure independence between the versions. It follows that it is necessary to assess the reliability actually achieved in a fault-tolerant system. The difficulty here lies mainly in acquiring knowledge of the degree of dependence between the failure processes of the versions. The paper addresses the problem using Bayesian inference. In particular, it considers the problem of choosing a prior distribution to represent the beliefs of an expert assessor. It is shown that this is not easy, and some pitfalls for the unwary are identified.
