Scalable, near-zero loss disaster recovery for distributed data stores

This paper presents a new Disaster Recovery (DR) system, called Slogger, that differs from prior work in two principal ways: (i) Slogger enables DR for a linearizable distributed data store, and (ii) Slogger adopts a continuous backup approach that strives to keep the backup site only a tiny lag behind the primary site, thereby restricting the window of data loss due to a disaster to milliseconds. These goals pose significant challenges related to the consistency of the backup site's state, failures, and scalability. Slogger addresses them with a combination of asynchronous log replication, intra-data-center synchronized clocks, pipelining, batching, and a novel watermark service. Furthermore, Slogger is designed to be deployable as an "add-on" module in an existing distributed data store with few modifications to the original code base. Our evaluation, conducted on Slogger extensions to a 32-shard version of LogCabin, an open-source key-value store, shows that Slogger maintains a very small data loss window of 14.2 milliseconds, which is near optimal in our evaluation setup. Moreover, Slogger reduces the data loss window by 50% compared to an incremental snapshotting technique, without imposing any performance penalty on the primary data store. Finally, our experiments demonstrate that Slogger achieves our other goals of scalability, fault tolerance, and efficient failover to the backup data store when a disaster is declared at the primary site.
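The abstract describes the watermark service only at a high level; the sketch below illustrates one plausible reading, assuming each shard asynchronously replicates log entries tagged with synchronized intra-data-center clock timestamps, and the backup site applies only the consistent prefix below the minimum timestamp durably received across all shards. All names here (ShardBackupLog, watermark, apply_up_to) are hypothetical illustrations, not Slogger's actual API.

    # Hypothetical sketch of watermark-based consistent backup application.
    # Assumes per-shard logs of (timestamp_ms, key, value) entries shipped
    # asynchronously from the primary, in timestamp order.

    from dataclasses import dataclass, field

    @dataclass
    class ShardBackupLog:
        """Per-shard log replica on the backup site."""
        shard_id: int
        entries: list = field(default_factory=list)  # (ts, key, value), sorted by ts
        applied_upto: float = 0.0                    # highest timestamp already applied

        def highest_received(self) -> float:
            return self.entries[-1][0] if self.entries else 0.0

    def watermark(shards: list[ShardBackupLog]) -> float:
        # A consistent cut: every shard has received all entries with
        # timestamps <= this value, so applying up to it cannot expose a
        # state that never existed on the primary.
        return min(s.highest_received() for s in shards)

    def apply_up_to(shard: ShardBackupLog, cut: float, store: dict) -> None:
        # Apply only the not-yet-applied entries at or below the watermark.
        for ts, key, value in shard.entries:
            if shard.applied_upto < ts <= cut:
                store[key] = value
        shard.applied_upto = max(shard.applied_upto, cut)

    if __name__ == "__main__":
        s0 = ShardBackupLog(0, [(10.0, "a", 1), (12.5, "b", 2)])
        s1 = ShardBackupLog(1, [(11.0, "c", 3)])
        backup_state: dict = {}
        cut = watermark([s0, s1])     # 11.0: shard 1 lags behind shard 0
        for s in (s0, s1):
            apply_up_to(s, cut, backup_state)
        print(cut, backup_state)      # 11.0 {'a': 1, 'c': 3}

Under this reading, the backup's lag (the data loss window) is bounded by how far the slowest shard's replication stream trails the synchronized clock, which is why pipelining and batching the log shipment matter for keeping it in the millisecond range.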
