File Versioning for Block-Level Continuous Data Protection

Block-level continuous data protection (CDP) logs every disk block update so that any update within a time window can be undone. Standard file servers and DBMS servers can benefit from the data protection offered by block-level CDP without any modification. Unfortunately, no existing block-level CDP system provides users with a file versioning view on top of the block versions it maintains. As a result, the data it maintains cannot serve as an extension of the on-line system with which users routinely interact. This paper describes a name-based, user-level file versioning system called UVFS that reconstructs file versions from the disk block versions maintained by a block-level CDP system. UVFS reconstructs file versions by following the last modified time of files and directories, a piece of file metadata supported by almost all modern file systems, and therefore requires no modification to the host file system that the block-level CDP system protects. In addition, UVFS incorporates a file system-specific incremental consistency check mechanism that quickly converts an arbitrary point-in-time block-level snapshot into a file system-consistent one. Performance measurements taken from a fully operational UVFS prototype show that the average end-to-end elapsed time required to discover a file version is under 50 msec from the perspective of an NFS client serviced by an NFS server backed by a block-level CDP system.
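
The idea of following last modified times can be illustrated with a minimal sketch. It is not the UVFS implementation: it assumes a hypothetical `lookup(path, at_time)` function that stands in for mounting a point-in-time snapshot and reading the file's last modified time there, and `make_lookup`, `discover_versions`, and `epsilon` are names invented for this illustration. Starting from the present, each modification time found points to the moment just before which the previous version must be sought.

```python
from bisect import bisect_right
from typing import Callable, Dict, List, Optional

# Hypothetical stand-in for a block-level CDP snapshot query: returns the
# file's last modified time as seen in the snapshot taken at `at_time`,
# or None if the file does not exist at that time.
MtimeLookup = Callable[[str, float], Optional[float]]


def make_lookup(history: Dict[str, List[float]]) -> MtimeLookup:
    """Build a toy lookup from {path: sorted list of modification times}."""
    def lookup(path: str, at_time: float) -> Optional[float]:
        times = history.get(path, [])
        i = bisect_right(times, at_time)  # count of modifications <= at_time
        return times[i - 1] if i else None
    return lookup


def discover_versions(path: str, now: float, lookup: MtimeLookup,
                      epsilon: float = 1e-3) -> List[float]:
    """Walk backward in time, collecting one timestamp per file version.

    Assumption: stepping back by `epsilon` from a modification time lands
    in a snapshot that still holds the previous version.
    """
    versions: List[float] = []
    t = now
    while True:
        mtime = lookup(path, t)
        if mtime is None:       # file did not exist yet: no older versions
            break
        versions.append(mtime)  # a version was written at this time
        t = mtime - epsilon     # look just before that write
    return versions


if __name__ == "__main__":
    toy = make_lookup({"/home/a.txt": [10.0, 20.0, 35.0]})
    print(discover_versions("/home/a.txt", now=100.0, lookup=toy))
```

In this toy model every lookup is an in-memory search. In a real deployment each step would require a consistent snapshot at the requested time, which is where the incremental consistency check described above would come in.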
