Using multi-threads to hide deduplication I/O latency with low synchronization overhead

Data deduplication, as a compression method, has been widely used in most backup systems to improve bandwidth and space efficiency. As data exploded to be backed up, two main challenges in data deduplication are the CPU-intensive chunking and hashing works and the I/O intensive disk-index access latency. However, CPU-intensive works have been vastly parallelized and speeded up by multi-core and many-core processors; the I/O latency is likely becoming the bottleneck in data deduplication. To alleviate the challenge of I/O latency in multi-core systems, multi-threaded deduplication (Multi-Dedup) architecture was proposed. The main idea of Multi-Dedup was using parallel deduplication threads to hide the I/O latency. A prefix based concurrent index was designed to maintain the internal consistency of the deduplication index with low synchronization overhead. On the other hand, a collisionless cache array was also designed to preserve locality and similarity within the parallel threads. In various real-world datasets experiments, Multi-Dedup achieves 3–5 times performance improvements incorporating with locality-based ChunkStash and local-similarity based SiLo methods. In addition, Multi-Dedup has dramatically decreased the synchronization overhead and achieves 1.5–2 times performance improvements comparing to traditional lock-based synchronization methods.

[1]  Sean Quinlan,et al.  Venti: A New Approach to Archival Storage , 2002, FAST.

[2]  Jianer Chen,et al.  Approximation algorithm for multiprocessor parallel job scheduling , 2002 .

[3]  Matei Ripeanu,et al.  A GPU accelerated storage system , 2010, HPDC '10.

[4]  Li Fan,et al.  Summary cache: a scalable wide-area web cache sharing protocol , 2000, TNET.

[5]  Jin Li,et al.  ChunkStash: Speeding Up Inline Storage Deduplication Using Flash Memory , 2010, USENIX Annual Technical Conference.

[6]  Jonathan Walpole,et al.  Resizable, Scalable, Concurrent Hash Tables via Relativistic Programming , 2011, USENIX ATC.

[7]  Akshat Verma,et al.  Shredder: GPU-accelerated incremental storage and computation , 2012, FAST.

[8]  Aleksey Pesterev,et al.  Fast, Inexpensive Content-Addressed Storage in Foundation , 2008, USENIX Annual Technical Conference.

[9]  Paul E. McKenney,et al.  READ-COPY UPDATE: USING EXECUTION HISTORY TO SOLVE CONCURRENCY PROBLEMS , 2002 .

[10]  Ian Pratt,et al.  Proceedings of the General Track: 2004 USENIX Annual Technical Conference , 2004 .

[11]  Petros Efstathopoulos,et al.  Building a High-performance Deduplication System , 2011, USENIX Annual Technical Conference.

[12]  Dongsheng Wang,et al.  A Novel Optimization Method to Improve De-duplication Storage System Performance , 2009, 2009 15th International Conference on Parallel and Distributed Systems.

[13]  Michal Kaczmarczyk,et al.  HYDRAstor: A Scalable Secondary Storage , 2009, FAST.

[14]  Kai Li,et al.  Avoiding the Disk Bottleneck in the Data Domain Deduplication File System , 2008, FAST.

[15]  Sean Matthew Dorward,et al.  Awarded Best Paper! - Venti: A New Approach to Archival Data Storage , 2002 .

[16]  Hong Jiang,et al.  SiLo: A Similarity-Locality based Near-Exact Deduplication Scheme with Low RAM Overhead and High Throughput , 2011, USENIX Annual Technical Conference.

[17]  Mark Lillibridge,et al.  Extreme Binning: Scalable, parallel deduplication for chunk-based file backup , 2009, 2009 IEEE International Symposium on Modeling, Analysis & Simulation of Computer and Telecommunication Systems.

[18]  Hong Jiang,et al.  Accelerating Data Deduplication by Exploiting Pipelining and Parallelism with Multicore or Manycore Processors , 2012 .

[19]  Abraham Lempel,et al.  A universal algorithm for sequential data compression , 1977, IEEE Trans. Inf. Theory.

[20]  Jonathan Walpole,et al.  Scalable concurrent hash tables via relativistic programming , 2010, OPSR.

[21]  David Mazières,et al.  A low-bandwidth network file system , 2001, SOSP.

[22]  Mark Lillibridge,et al.  Sparse Indexing: Large Scale, Inline Deduplication Using Sampling and Locality , 2009, FAST.

[23]  Udi Manber,et al.  Finding Similar Files in a Large File System , 1994, USENIX Winter.