DeDiSys Lite: an environment for evaluating replication protocols in partitionable distributed object systems

Distributed object systems that must tolerate network partitions face a trade-off between availability and consistency. Changes made in one partition are not visible in any other partition. Therefore, if strong consistency is required, certain operations cannot be permitted while the system is partitioned, which reduces availability. In the DeDiSys project we aim to make this trade-off between consistency and availability configurable. The DeDiSys distributed object system relies heavily on replication protocols that provide high availability while ensuring the level of consistency required by a particular application. We have developed DeDiSys Lite, a prototype of the DeDiSys system, which provides a platform for implementing and evaluating these replication protocols. Infrastructure components are provided in minimal implementations. Configuration files allow system parameters, such as the degree of replication or the nesting of object invocations, to be modified without adapting application code. We use DeDiSys Lite both as a simulation environment for developing new replication protocols and as a basis for the continued development of the DeDiSys system. This paper presents some results obtained by using the platform to optimise a new replication protocol.
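
The configurable trade-off described in the abstract can be illustrated with a minimal sketch. It is not the DeDiSys Lite implementation: the names `ConsistencyLevel`, `Partition` and `try_write` are hypothetical, and the majority rule used to decide whether a partition may write under strong consistency is an assumption made only for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum


class ConsistencyLevel(Enum):
    """Hypothetical configuration switch; not taken from DeDiSys Lite."""
    STRONG = "strong"    # reject writes that could later conflict
    RELAXED = "relaxed"  # accept writes and reconcile after the partition heals


@dataclass
class Partition:
    """One side of a partitioned system, as seen by a single node."""
    reachable_replicas: int
    total_replicas: int
    level: ConsistencyLevel
    pending_reconciliation: list = field(default_factory=list)

    def try_write(self, object_id: str, value: object) -> bool:
        # Assumption: a partition is "safe" if it reaches a strict majority
        # of the replicas.
        has_majority = 2 * self.reachable_replicas > self.total_replicas
        if self.level is ConsistencyLevel.STRONG and not has_majority:
            # Consistency is preserved at the cost of availability.
            return False
        if not has_majority:
            # Availability is preserved; the write must be reconciled later.
            self.pending_reconciliation.append((object_id, value))
        return True


minority_strong = Partition(1, 3, ConsistencyLevel.STRONG)
minority_relaxed = Partition(1, 3, ConsistencyLevel.RELAXED)
print(minority_strong.try_write("account-1", 100))
print(minority_relaxed.try_write("account-1", 100))
```

With the same application call, only the configured level changes whether the write in the minority partition is rejected or accepted and queued for later reconciliation.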
