Unobtrusive Deferred Update Stabilization for Efficient Geo-Replication

In this paper we propose a novel approach to managing the throughput vs. latency tradeoff that emerges when handling updates in geo-replicated systems. Our approach consists of allowing full concurrency when processing local updates and using a deferred local serialisation procedure before shipping updates to remote datacenters. This strategy makes it possible to implement inexpensive mechanisms that ensure the system's consistency requirements while avoiding intrusive effects on update operations, a major performance limitation of previous systems. We have implemented our approach as a variant of Riak KV. Our extensive evaluation shows that we outperform sequencer-based approaches by almost an order of magnitude in maximum achievable throughput. Furthermore, unlike previous sequencer-free solutions, our approach reaches nearly optimal remote update visibility latencies without limiting throughput.
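The idea of processing local updates concurrently and serialising them later, off the critical path, can be illustrated with a minimal sketch. This is not the paper's implementation: the `Partition` and `Stabilizer` classes, the integer logical timestamps, and the heartbeat mechanism are all hypothetical names and assumptions chosen for illustration. Each partition tags updates on its own, and a background stabilizer ships only those updates whose timestamp no partition can still undercut.

```python
import heapq
from dataclasses import dataclass, field


@dataclass(order=True)
class Update:
    ts: int                              # hypothetical logical timestamp
    partition: int
    key: str = field(compare=False)
    value: str = field(compare=False)


class Stabilizer:
    """Background component that totally orders updates before shipping (assumed design)."""

    def __init__(self, num_partitions):
        self.latest = [0] * num_partitions   # highest timestamp seen per partition
        self.pending = []                    # min-heap of updates not yet shipped

    def receive(self, update):
        self.latest[update.partition] = max(self.latest[update.partition], update.ts)
        heapq.heappush(self.pending, update)

    def heartbeat(self, pid, ts):
        # Lets an idle partition advance the stable time.
        self.latest[pid] = max(self.latest[pid], ts)

    def drain(self):
        # Every partition has promised to issue only larger timestamps than
        # latest[pid], so anything at or below the minimum is safe to ship in order.
        stable = min(self.latest)
        shipped = []
        while self.pending and self.pending[0].ts <= stable:
            shipped.append(heapq.heappop(self.pending))
        return shipped


class Partition:
    """Accepts local updates without coordinating with other partitions."""

    def __init__(self, pid, stabilizer):
        self.pid = pid
        self.clock = 0
        self.stabilizer = stabilizer

    def put(self, key, value, client_ts=0):
        # Pick a timestamp above both the local clock and what the client has seen.
        self.clock = max(self.clock, client_ts) + 1
        update = Update(self.clock, self.pid, key, value)
        # In a real system this hand-off would be asynchronous; here it is a direct call.
        self.stabilizer.receive(update)
        return update.ts

    def heartbeat(self, now):
        self.clock = max(self.clock, now)
        self.stabilizer.heartbeat(self.pid, self.clock)


stabilizer = Stabilizer(num_partitions=2)
p0, p1 = Partition(0, stabilizer), Partition(1, stabilizer)
t1 = p0.put("a", "1")
t2 = p1.put("b", "2", client_ts=t1)
t3 = p0.put("c", "3", client_ts=t2)
first_batch = [u.key for u in stabilizer.drain()]
p1.heartbeat(5)
second_batch = [u.key for u in stabilizer.drain()]
```

In this sketch the `put` path never waits on a sequencer or on other partitions; ordering work happens only in `drain`, which is the sense in which serialisation is deferred. The update to `"c"` is held back until partition 1 reports a timestamp at least as large, either through a later update or through a heartbeat.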

[52]  Angel Bravo Gestoso Unobtrusive Deferred Update Stabilization for Efficient Geo-Replication , 2017, USENIX ATC 2017.