CloudDT: Efficient tape resource management using deduplication in cloud backup and archival services

Cloud-based backup and archival services today use large tape libraries as a cost-effective cold tier in their online storage hierarchy. These services use deduplication to reduce the disk capacity that customer data sets require, but they usually re-duplicate the data, restoring it to its full undeduplicated form, when moving it from disk to tape.
