Increasing Availability in a Replicated Partitionable Distributed Object System

Replicating objects in distributed object systems provides fault tolerance and increases availability. We have designed a replication protocol for distributed object systems that increases availability by temporarily relaxing consistency. The protocol allows all partitions of a partitioned system to continue operating, and the states of certain replicas are allowed to diverge. The application programmer can specify the required consistency using integrity constraints. We present an analytical model of the new protocol and evaluate it against the primary partition model, in which only a majority partition is allowed to continue. Furthermore, we identify the type of application for which our protocol provides increased availability.
