Towards a Scalable, Distributed Metadata Service for Causal Consistency under Partial Geo-replication

Causal consistency is a consistency criterion of practical relevance in geo-replicated settings because it provides well-defined semantics in a scalable manner. In fact, it has been proved that causal consistency is the strongest consistency model that can be enforced in an always-available system. Previous approaches to providing causal consistency, which successfully tackle the problem under full geo-replication, have unveiled an inherent tradeoff between the concurrency that the system allows and the size of the metadata needed to enforce causality. When the metadata is compressed, information about concurrency may be lost, creating false dependencies: the encoding may suggest a causal relation that does not exist in reality. False dependencies may introduce artificial delays when processing requests and degrade the quality of service experienced by clients. Whether it is possible to design a scalable solution that uses only an almost negligible amount of metadata and is still capable of achieving high levels of concurrency under partial geo-replication, an increasingly relevant setting, remains a challenging and interesting open research question. This position paper reports on the ongoing development of Saturn, a metadata service for geo-replicated systems that aims to mitigate the effects of false dependencies while keeping the metadata size small, even in challenging settings such as partial geo-replication.
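
To make the notion of a false dependency concrete, the following minimal sketch compares a full vector clock with a lossy scalar encoding. It is an illustration only and does not describe Saturn's metadata: the `compress` function (sum of the vector entries) and all names are assumptions chosen for the example.

```python
from typing import Tuple

# One entry per datacenter (hypothetical representation for this example).
VectorClock = Tuple[int, ...]


def happens_before(u: VectorClock, v: VectorClock) -> bool:
    """Standard vector-clock order: u <= v entrywise and u != v."""
    return all(x <= y for x, y in zip(u, v)) and u != v


def compress(v: VectorClock) -> int:
    """Assumed lossy encoding: collapse the vector into a single scalar.

    Summing the entries preserves every true dependency (if u precedes v,
    then sum(u) < sum(v)), but it also orders updates that are concurrent.
    """
    return sum(v)


def scalar_suggests_dependency(u: VectorClock, v: VectorClock) -> bool:
    """What a replica infers when it only sees the compressed metadata."""
    return compress(u) < compress(v)


# Update a is issued at datacenter 0; update b is issued at datacenter 1.
# Neither update has observed the other, so they are concurrent.
a: VectorClock = (1, 0)
b: VectorClock = (0, 2)

true_dependency = happens_before(a, b)
encoded_dependency = scalar_suggests_dependency(a, b)

# A false dependency: the encoding orders a before b, the real history does not.
false_dependency = encoded_dependency and not true_dependency
```

In this sketch, a replica that relies only on the scalar would delay applying `b` until it has applied `a`, even though the two updates are unrelated. This is the kind of artificial delay the abstract refers to.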
