Modeling software design diversity: a review

Design diversity has been used for many years as a means of achieving a degree of fault tolerance in software-based systems. While there is clear evidence that the approach can be expected to deliver some increase in reliability compared to a single version, there is no agreement about the extent of that increase. More importantly, it remains difficult to evaluate exactly how reliable a particular diverse fault-tolerant system is. This difficulty arises because assumptions of independence of failures between different versions have been shown to be untenable: the actual level of dependence present must therefore be assessed, and this is difficult. In this tutorial, we survey the modeling issues involved, with an emphasis on their impact on the problem of assessing the reliability of fault-tolerant systems. The intended audience comprises designers, assessors, and project managers with only a basic knowledge of probability, as well as reliability experts without detailed knowledge of software, who seek an introduction to the probabilistic issues in decisions about design diversity.
