Paper Information - Towards Fast De-duplication Using Low Energy Coprocessor

Backup technology based on data de-duplication has become a hot research topic. To improve performance, prior research has focused mainly on reducing disk access time. In this paper, we consider the computational complexity of data de-duplication systems and try to improve system performance by reducing computing time. We offload computing tasks to a commodity coprocessor to speed up the computation. Compared with general-purpose processors, commodity coprocessors have lower energy consumption and lower cost. Experimental results show that they achieve equal or even better performance than general-purpose processors.

[1] Mark Lillibridge, et al. Extreme Binning: Scalable, parallel deduplication for chunk-based file backup, 2009, 2009 IEEE International Symposium on Modeling, Analysis & Simulation of Computer and Telecommunication Systems.

[2] Sean Quinlan, et al. Venti: A New Approach to Archival Storage, 2002, FAST.

[3] Kai Li, et al. Avoiding the Disk Bottleneck in the Data Domain Deduplication File System, 2008, FAST.

[4] Hong Jiang, et al. DEBAR: A scalable high-performance de-duplication storage system for backup and archiving, 2010, 2010 IEEE International Symposium on Parallel & Distributed Processing (IPDPS).

[5] Burton H. Bloom, et al. Space/time trade-offs in hash coding with allowable errors, 1970, CACM.

[6] Qing Yang, et al. TRAP-Array: A Disk Array Architecture Providing Timely Recovery to Any Point-in-time, 2006, ISCA 2006.

[7] Ke Zhou, et al. TSPSCDP: A Time-Stamp Continuous Data Protection Approach Based on Pipeline Strategy, 2008, 2008 Japan-China Joint Workshop on Frontier of Computer Science and Technology.

[8] Xu Li, et al. Optimal Implementation of Continuous Data Protection (CDP) in Linux Kernel, 2008, 2008 International Conference on Networking, Architecture, and Storage.

[9] Paula Ta-Shma, et al. Architectures for Controller Based CDP, 2007, FAST.

[10] Tzi-cker Chiueh, et al. An Incremental File System Consistency Checker for Block-Level CDP Systems, 2008, 2008 Symposium on Reliable Distributed Systems.

[11] Mark Lillibridge, et al. Sparse Indexing: Large Scale, Inline Deduplication Using Sampling and Locality, 2009, FAST.