Implementation of blocking coordinated atomic actions based on forward error recovery

Abstract: The coordinated atomic action concept was proposed as a means of providing fault tolerance in complex object-oriented systems that incorporate both cooperative and competitive concurrency. This paper has two purposes: to discuss a particular implementation of this concept, and to address a number of implementation issues that are common to any experiment with it. Our implementation relies on a detailed set of programming conventions for the standard Ada 95 language and uses a forward error recovery scheme that incorporates concurrent exception handling and resolution. Ada 95 has a number of unique features that make it a particularly good choice for our experiments. We believe that our approach is practical and useful for many critical applications with high dependability requirements.
