ST-CDP: Snapshots in TRAP for Continuous Data Protection

Continuous Data Protection (CDP) has become increasingly important as digitization continues. This paper presents a new architecture and an implementation of CDP in the Linux kernel. The new architecture takes advantage of both traditional snapshot technology and the recent Timely Recovery to Any Point-in-time (TRAP) architecture. The idea is to periodically insert snapshots within the parity logs of changed data blocks in order to ensure fast and reliable data recovery in case of failures. A mathematical model is developed to guide designers in determining when and how to insert snapshots so as to optimize space usage and recovery time. Based on the mathematical model, we have designed and implemented a CDP module in Linux. The implementation operates at the block level as a device driver and is capable of recovering data to any point in time after various failures. Extensive experiments show that the implementation is fairly robust, and numerical results demonstrate that it is efficient.
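
The idea of inserting snapshots into a parity log can be illustrated with a minimal user-space sketch for a single block. This is not the paper's kernel driver: the class `SnapshotParityLog`, the parameter `snapshot_interval`, and the choice to recover forward from the nearest earlier snapshot are all assumptions made for illustration. Each write appends the XOR of the old and new block contents, and every `snapshot_interval` writes a full copy is kept, so recovery never has to replay more than `snapshot_interval - 1` parities.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


class SnapshotParityLog:
    """Hypothetical per-block history: XOR parities plus periodic snapshots."""

    def __init__(self, initial: bytes, snapshot_interval: int):
        if snapshot_interval < 1:
            raise ValueError("snapshot_interval must be >= 1")
        self.interval = snapshot_interval
        self.current = initial
        # parities[i] is (version i) XOR (version i + 1).
        self.parities = []
        # Full copies keyed by version number; version 0 is the initial block.
        self.snapshots = {0: initial}

    def write(self, data: bytes) -> int:
        """Record a new block version and return its version number."""
        if len(data) != len(self.current):
            raise ValueError("block size must not change")
        self.parities.append(xor_bytes(self.current, data))
        self.current = data
        version = len(self.parities)
        # Assumed policy: keep a full snapshot every `interval` writes.
        if version % self.interval == 0:
            self.snapshots[version] = data
        return version

    def recover(self, version: int):
        """Return (block contents at `version`, number of parities replayed)."""
        if not 0 <= version <= len(self.parities):
            raise IndexError("unknown version")
        # Start from the closest snapshot at or before the target version.
        start = max(v for v in self.snapshots if v <= version)
        block = self.snapshots[start]
        for parity in self.parities[start:version]:
            block = xor_bytes(block, parity)
        return block, version - start


log = SnapshotParityLog(initial=bytes(4), snapshot_interval=3)
for payload in (b"aaaa", b"bbbb", b"cccc", b"dddd", b"eeee"):
    log.write(payload)

block, replayed = log.recover(4)
```

A smaller `snapshot_interval` shortens the parity chain that must be replayed, and so limits the damage from a corrupted parity, at the cost of storing more full copies. This is the space versus recovery-time trade-off that the abstract says the mathematical model addresses; the sketch does not reproduce that model.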
