Avoiding the Disk Bottleneck in the Data Domain Deduplication File System

Disk-based deduplication storage has emerged as the new-generation storage system for enterprise data protection to replace tape libraries. Deduplication removes redundant data segments to compress data into a highly compact form and makes it economical to store backups on disk instead of tape. A crucial requirement for enterprise data protection is high throughput, typically over 100 MB/sec, which enables backups to complete quickly. A significant challenge is to identify and eliminate duplicate data segments at this rate on a low-cost system that cannot afford enough RAM to store an index of the stored segments and may be forced to access an on-disk index for every input segment. This paper describes three techniques employed in the production Data Domain deduplication file system to relieve the disk bottleneck. These techniques include: (1) the Summary Vector, a compact in-memory data structure for identifying new segments; (2) Stream-Informed Segment Layout, a data layout method to improve on-disk locality for sequentially accessed segments; and (3) Locality Preserved Caching, which maintains the locality of the fingerprints of duplicate segments to achieve high cache hit ratios. Together, they can remove 99% of the disk accesses for deduplication of real-world workloads. These techniques enable a modern two-socket dual-core system to run at 90% CPU utilization with only one shelf of 15 disks and achieve 100 MB/sec for single-stream throughput and 210 MB/sec for multi-stream throughput.
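
To make the role of the Summary Vector concrete, the following is a minimal sketch in Python. The abstract does not spell out the Summary Vector's internals, so this assumes a standard Bloom-filter-style design; the class name SummaryVector and the sizing parameters num_bits and num_hashes are illustrative choices, not the production implementation.

    import hashlib

    class SummaryVector:
        """Compact in-memory filter that answers "is this segment definitely new?"

        False positives are possible (a new segment may look like a duplicate
        and fall through to the on-disk index), but false negatives are not,
        so no stored duplicate is ever missed.
        """

        def __init__(self, num_bits: int = 1 << 20, num_hashes: int = 4):
            # Illustrative sizing only; a real deployment would size the bit
            # array for the expected number of stored segments.
            self.num_bits = num_bits
            self.num_hashes = num_hashes
            self.bits = bytearray(num_bits // 8)

        def _positions(self, fingerprint: bytes):
            # Derive k bit positions from the segment fingerprint.
            # Illustrative choice: SHA-1 over a per-hash salt byte plus the
            # fingerprint.
            for i in range(self.num_hashes):
                digest = hashlib.sha1(bytes([i]) + fingerprint).digest()
                yield int.from_bytes(digest[:8], "big") % self.num_bits

        def add(self, fingerprint: bytes) -> None:
            for pos in self._positions(fingerprint):
                self.bits[pos // 8] |= 1 << (pos % 8)

        def might_contain(self, fingerprint: bytes) -> bool:
            return all(self.bits[pos // 8] & (1 << (pos % 8))
                       for pos in self._positions(fingerprint))

    # Usage: consult the slow on-disk index only when the filter says the
    # fingerprint may already be stored; definitely-new segments skip the
    # disk lookup entirely.
    sv = SummaryVector()
    fp = hashlib.sha1(b"segment data").digest()
    if not sv.might_contain(fp):
        sv.add(fp)  # new segment: record its fingerprint and store the data

The design point this illustrates is that an occasional false positive only costs one extra index lookup, while the common case of a genuinely new segment avoids touching the on-disk index at all.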
