Design and evaluation of a conit-based continuous consistency model for replicated services

The tradeoffs between consistency, performance, and availability are well understood. Traditionally, however, designers of replicated systems have been forced to choose between strong consistency guarantees and none at all. This paper explores the semantic space between traditional strong and optimistic consistency models for replicated services. We argue that an important class of applications can tolerate relaxed consistency but benefits from bounding the maximum rate of inconsistent access in an application-specific manner. We therefore develop a conit-based continuous consistency model that captures the consistency spectrum using three application-independent metrics: numerical error, order error, and staleness. We then present the design and implementation of TACT, a middleware layer that enforces arbitrary consistency bounds among replicas using these metrics. We argue that the TACT consistency model can simultaneously achieve the often conflicting goals of generality and practicality, first by describing how a broad range of applications can express their consistency semantics using TACT, and second by demonstrating that application-independent algorithms can efficiently enforce target consistency levels. Finally, we show that three replicated applications running across the Internet obtain significant semantic and performance benefits from using our framework.
