Implementing Network Partition-Aware Fault-Tolerant CORBA Systems

The current standard for fault tolerance in the Common Object Request Broker Architecture (CORBA) does not support network partitioning. However, distributed systems, particularly those deployed on wide area networks, are susceptible to network partitions. This paper describes the design and implementation of a CORBA fault-tolerance add-on for partitionable environments. Our solution can be applied to an off-the-shelf Object Request Broker (ORB) without access to the ORB's source code and with minimal changes to existing CORBA applications. The system differs from existing solutions in that different replication and reconciliation strategies can be implemented easily. Furthermore, we provide a novel replication and reconciliation protocol that increases system availability by allowing operations in all partitions to continue.
