Analysis of the Abortion Rate on Lazy Replication Protocols

Shared memory applications are central to solving a large number of problems in distributed systems. They range from high-performance applications, where the computational units use shared memory to simplify their design (and often to improve performance), to database applications, where a replicated database can be regarded as a form of shared memory for the nodes involved. All of these applications use replication as the basis for implementing shared memory, and they frequently share common characteristics with respect to access locality to particular portions of the global state. Replication is also commonly used in distributed systems to provide fault tolerance. Many techniques have been designed to manage consistency among the different views of a replicated memory system. Some of these techniques try to take advantage of access locality by propagating the changes made by a node lazily (i.e., as late as possible). Nevertheless, lazy update protocols have been shown to behave poorly, because of their high abortion rate, in scenarios with a high degree of access conflicts. In this paper, we study the abortion rate of such protocols from a statistical point of view, in order to provide an expression that predicts the probability that an object is out of date during the execution of a transaction in a given environment. We also propose a pseudo-optimistic technique that uses this expression to reduce the abortion rate caused by accesses to out-of-date objects. The proposal is validated by an empirical study of the behavior of the expression, including measurements of a real implementation. Finally, we discuss how these results can be applied to improve lazy update protocols, providing a technique to determine the theoretical bounds of the improvement.
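The abstract does not state the expression itself, so the following is only an illustrative sketch of how a staleness probability could drive a pseudo-optimistic access decision. It assumes, purely for illustration, that remote updates to an object arrive as a Poisson process with a known rate. The function names, the `threshold` parameter, and the Poisson model are all hypothetical and are not taken from the paper.

```python
import math


def stale_probability(update_rate: float, elapsed: float) -> float:
    """Probability that a local copy is out of date after `elapsed` time units.

    ASSUMPTION (not from the paper): remote updates to the object form a
    Poisson process with `update_rate` updates per time unit, so the copy is
    stale if at least one update arrived since it was last refreshed.
    """
    if update_rate < 0 or elapsed < 0:
        raise ValueError("update_rate and elapsed must be non-negative")
    return 1.0 - math.exp(-update_rate * elapsed)


def should_refresh(update_rate: float, elapsed: float, threshold: float = 0.5) -> bool:
    """Hypothetical pseudo-optimistic rule: refresh the object before the
    transaction accesses it when the estimated staleness probability reaches
    `threshold`; otherwise access the local copy optimistically."""
    return stale_probability(update_rate, elapsed) >= threshold
```

Under these assumptions, an object that is rarely updated or was refreshed recently is accessed locally, while an object that is likely to be stale is refreshed first, trading one extra message for a lower chance of aborting the transaction.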
