Paper Information - Low-Profile Source-side Deduplication for Virtual Machine Backup

This paper presents a source-side backup scheme with low resource usage, based on collaborative deduplication and approximated lazy deletion, for large-scale cloud clusters that require frequent virtual machine snapshot backups. The key ideas are to orchestrate multi-round duplicate-detection batches among machines in a partitioned, asynchronous manner and to remove most unreferenced content chunks through approximated snapshot deletion. The paper discusses the challenges, the main design and strategies, and the evaluation results.

Wei Zhang | Tao Yang | Daniel Agun

[1] Yucheng Zhang, et al. Design Tradeoffs for Data Deduplication Performance in Backup Workloads, 2015, FAST.

[2] Irfan Ahmad, et al. Decentralized Deduplication in SAN Cluster File Systems, 2009, USENIX Annual Technical Conference.

[3] Kai Li, et al. Tradeoffs in Scalable Data Routing for Deduplication Clusters, 2011, FAST.

[4] Timothy Bisson, et al. iDedup: Latency-aware, Inline Data Deduplication for Primary Storage, 2012, FAST.

[5] Mark Lillibridge, et al. Sparse Indexing: Large Scale, Inline Deduplication Using Sampling and Locality, 2009, FAST.

[6] Wei Zhang, et al. VM-centric Snapshot Deduplication for Cloud Data Backup, 2015, 2015 31st Symposium on Mass Storage Systems and Technologies (MSST).

[7] Hong Jiang, et al. Application-Aware Local-Global Source Deduplication for Cloud Backup Services of Personal Storage, 2014, IEEE Transactions on Parallel and Distributed Systems.

[8] Mark Lillibridge, et al. Extreme Binning: Scalable, Parallel Deduplication for Chunk-based File Backup, 2009, 2009 IEEE International Symposium on Modeling, Analysis & Simulation of Computer and Telecommunication Systems.

[9] Hong Jiang, et al. SAM: A Semantic-Aware Multi-tiered Source De-duplication Framework for Cloud Backup, 2010, 2010 39th International Conference on Parallel Processing.

[10] Michael Vrable, et al. Cumulus: Filesystem Backup to the Cloud, 2009, TOS.

[11] Philip Shilane, et al. Memory Efficient Sanitization of a Deduplicated Storage System, 2013, FAST.

[12] Petros Efstathopoulos, et al. Building a High-performance Deduplication System, 2011, USENIX Annual Technical Conference.

[13] Cezary Dubnicki, et al. Concurrent Deletion in a Distributed Content-addressable Storage System with Global Deduplication, 2013, FAST.

[14] Wei Zhang, et al. Low-Cost Data Deduplication for Virtual Machine Backup in Cloud Storage, 2013, HotStorage.

[15] Kai Li, et al. Avoiding the Disk Bottleneck in the Data Domain Deduplication File System, 2008, FAST.

[16] Hao Jiang, et al. Multi-level Selective Deduplication for VM Snapshots in Cloud Storage, 2012, 2012 IEEE Fifth International Conference on Cloud Computing.

[17] Sean Quinlan, et al. Venti: A New Approach to Archival Storage, 2002, FAST.