A Simulation Analysis of Redundancy and Reliability in Primary Storage Deduplication

Deduplication has been widely used to improve storage efficiency in modern primary and secondary storage systems, yet how deduplication fundamentally affects storage system reliability remains debatable. This paper analyzes and compares storage system reliability with and without deduplication under primary workloads, using public file system snapshots from two research groups. We first study the redundancy characteristics of the file system snapshots. We then propose a trace-driven, deduplication-aware simulation framework to analyze data loss at both the chunk and file levels due to sector errors and whole-disk failures. Compared with no deduplication, our analysis shows that deduplication consistently reduces the damage of sector errors because it eliminates intra-file redundancy, but potentially increases the damage of whole-disk failures if the highly referenced chunks are not carefully placed on disk. To improve reliability, we examine a deliberate copy technique that stores the most referenced chunks in a small dedicated physical area (e.g., 1 percent of the physical capacity) and repairs them first, and we demonstrate its effectiveness through our simulation framework.
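As a rough illustration of the selection step behind the deliberate copy technique, the sketch below ranks unique chunks by reference count and picks those that fit in a small dedicated area. It is a minimal sketch under stated assumptions, not the paper's implementation: the function name `select_dedicated_chunks`, the `file_recipes` and `chunk_sizes` inputs, the use of the total size of unique chunks as the physical capacity, and the stop-at-first-overflow policy are all hypothetical choices made for the example.

```python
from collections import Counter


def select_dedicated_chunks(file_recipes, chunk_sizes, budget_fraction=0.01):
    """Pick the most referenced unique chunks that fit in a dedicated area.

    file_recipes: dict mapping a file name to its list of chunk fingerprints
                  (hypothetical input format).
    chunk_sizes:  dict mapping a chunk fingerprint to its size in bytes.
    budget_fraction: share of the physical capacity reserved for the
                  dedicated area (1 percent by default, as in the abstract).
    """
    # Count how often each unique chunk is referenced across all files.
    ref_counts = Counter(
        fp for recipe in file_recipes.values() for fp in recipe
    )

    # Assumption: physical capacity is the total size of unique chunks.
    physical_capacity = sum(chunk_sizes[fp] for fp in ref_counts)
    budget = physical_capacity * budget_fraction

    selected, used = [], 0
    # Visit chunks from most to least referenced; ties broken by fingerprint.
    for fp, _count in sorted(ref_counts.items(), key=lambda kv: (-kv[1], kv[0])):
        size = chunk_sizes[fp]
        if used + size > budget:
            break  # Stop at the first chunk that would overflow the area.
        selected.append(fp)
        used += size
    return selected, ref_counts
```

In a simulation, the returned list would also serve as the repair order: chunks in the dedicated area are recovered before the rest, so the chunks whose loss would affect the most files are restored first.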
