The research and implementation of a CORBA-based architecture for adaptive fault tolerance in distributed systems

The new generation of complex mission-critical systems (such as air traffic control, security monitoring, and real-time systems) is inherently distributed and operates in highly dynamic environments. Fault tolerance is a principal means of assuring system reliability. A single fault tolerance policy cannot accommodate the dynamic changes these systems undergo, so their fault tolerance mechanisms need more intelligence: they should adapt to changes in system resources, application demands, and user requirements while improving resource utilization. This paper presents AFTLSDS, a CORBA-based architecture for adaptive fault tolerance in distributed systems, designed to meet the requirements of this new generation of complex mission-critical systems. We focus on its components and design policies. Finally, we describe a prototype implementation of the architecture and present our conclusions.
