Architectures for Controller Based CDP

Continuous Data Protection (CDP) is a recent storage technology that enables reverting the state of the storage to previous points in time. We propose four alternative architectures for supporting CDP in a storage controller and compare them analytically with respect to both write performance and space usage overheads. We describe exactly how factors such as the protection granularity (continuous or at fixed intervals) and the temporal distance distribution of the given workload affect these overheads. Our model allows predicting the CDP overheads for arbitrary workloads and identifying the best architecture for a given scenario. We verify our analysis by running a prototype CDP-enabled block device on both synthetic and traced workloads and comparing the measured results with the model's predictions. Our work is the first to consider, both analytically and empirically, how performance is affected by varying the protection granularity. In addition, it is the first to precisely quantify the natural connection between CDP overheads and a workload's temporal locality. We show that one of the architectures we considered is superior for workloads exhibiting high temporal locality relative to the granularity, whereas another is superior for workloads exhibiting low temporal locality relative to the granularity. We analyze two specific workloads, an OLTP workload and a file server workload, and show which CDP architecture is superior for each workload at each granularity.
