Performance Sensitive Replication in Geo-distributed Cloud Datastores

Modern web applications face stringent requirements along many dimensions, including latency, scalability, and availability. In response, several geo-distributed cloud datastores have emerged in recent years. Customizing datastores to meet application SLAs is challenging given the scale of applications and their diverse and dynamic workloads. In this paper, we tackle these challenges in the context of quorum-based systems (e.g., Amazon Dynamo, Cassandra), an important class of cloud storage systems. We present models that optimize percentiles of response time under normal operation and under a data-center (DC) failure. Our models consider factors such as the geographic spread of users, DC locations, consistency requirements, and inter-DC communication costs. We evaluate our models using real-world traces of three applications (Twitter, Wikipedia, and Gowalla) on a Cassandra cluster deployed in Amazon EC2. Our results confirm the importance and effectiveness of our models and highlight the benefits of customizing replication in cloud datastores.
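To make the optimization target concrete, the following is a minimal sketch of how a percentile of quorum latency might be computed and minimized by brute force. It is an illustration under stated assumptions, not the formulation used in the paper: the function names, the DC labels, the round-trip times, and the request counts are all hypothetical, the quorum sizes are assumed to satisfy the strict condition R + W > N, and the coordinator is assumed to sit in the user's nearest DC and wait for the fastest q replicas.

```python
from itertools import combinations


def quorum_latency(src, replicas, q, rtt):
    """Time for a coordinator in DC `src` to hear from q replicas.

    Assumes requests go to all replicas in parallel, so the answer is
    the q-th smallest round-trip time from `src` to the replica DCs.
    """
    return sorted(rtt[src][r] for r in replicas)[q - 1]


def weighted_percentile(samples, pct):
    """Smallest latency covering at least pct% of the request weight.

    `samples` is a list of (latency, weight) pairs with integer weights.
    """
    if not samples:
        raise ValueError("no samples")
    total = sum(w for _, w in samples)
    acc = 0
    lat = None
    for lat, w in sorted(samples):
        acc += w
        if acc * 100 >= pct * total:  # integer comparison avoids float error
            break
    return lat


def best_config(rtt, demand, n, pct=90):
    """Exhaustively search placements of n replicas and quorum sizes.

    Returns (score, replicas, r, w) minimizing the pct-th percentile of
    latency over all reads and writes. Only feasible for a handful of DCs.
    """
    best = None
    for replicas in combinations(sorted(rtt), n):
        for r in range(1, n + 1):
            w = n - r + 1  # smallest write quorum with r + w > n
            samples = []
            for src, (reads, writes) in demand.items():
                samples.append((quorum_latency(src, replicas, r, rtt), reads))
                samples.append((quorum_latency(src, replicas, w, rtt), writes))
            cand = (weighted_percentile(samples, pct), replicas, r, w)
            if best is None or cand < best:
                best = cand
    return best


# Hypothetical inter-DC round-trip times in milliseconds.
rtt = {
    "A": {"A": 0, "B": 70, "C": 90},
    "B": {"A": 70, "B": 0, "C": 150},
    "C": {"A": 90, "B": 150, "C": 0},
}
# Hypothetical (reads, writes) issued by users nearest to each DC.
demand = {"A": (80, 20), "B": (50, 10), "C": (30, 10)}

print(best_config(rtt, demand, n=2))
```

The exhaustive search is chosen only for readability. With realistic numbers of DCs and the added requirement of tolerating a DC failure, the search space grows quickly, which is one reason an optimization model is a more natural fit than enumeration.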
