Development of software fault-tolerant applications with Ada95 object-oriented support

Experience has shown that current software engineering practice is inadequate for producing error-free software, so software fault tolerance (SWFT) must be employed when developing complex safety-critical applications. However, developing applications that can tolerate software errors is challenging, because developers must master not only the complexity of the application but also that of the fault-tolerance protocols. A middleware that provides SWFT services and establishes a well-defined interface with the application modules allows the application developer to focus solely on the complexity of the application. This paper presents such a middleware, consisting of reusable SWFT components, and explores how these components interface with the application in order to tolerate its faults. The paper also reports our experience in using the real-time and object-oriented features of the new Ada standard (Ada95) to implement the middleware.
