The costs and limits of availability for replicated services

As raw system performance continues to improve at exponential rates, the utility of many services is increasingly limited by availability rather than performance. A key approach to improving availability involves replicating the service across multiple wide-area sites. However, replication introduces well-known trade-offs between service consistency and availability. This article therefore explores the benefits of dynamically trading consistency for availability using a continuous consistency model. In this model, applications specify a maximum deviation from strong consistency on a per-replica basis. In this article, we: i) evaluate the availability of a prototype replication system running across the Internet as a function of consistency level, consistency protocol, and failure characteristics, ii) demonstrate that simple optimizations to existing consistency protocols result in significant availability improvements (more than an order of magnitude in some scenarios), iii) use our experience with these optimizations to prove a tight upper bound on the availability of services, and iv) show that maximizing availability typically entails remaining as close to strong consistency as possible during times of good connectivity, resulting in a communication versus availability trade-off.
