Archiving cold data in warehouses with clustered network coding

Modern storage systems now typically combine plain replication and erasure codes to reliably store large amounts of data in datacenters. Plain replication allows fast access to popular data, while erasure codes, e.g., Reed-Solomon codes, provide a storage-efficient alternative for archiving less popular data. Although erasure codes are increasingly employed in real systems, they incur high overhead during maintenance: upon a failure, files typically have to be decoded and then encoded again to repair the encoded blocks stored at the faulty node. In this paper, we propose a novel erasure code system tailored for networked archival systems. The efficiency of our approach relies on the joint use of random codes and a clustered placement strategy. Our repair protocol leverages network coding techniques to reduce the amount of data transferred during maintenance by 50%, by repairing several cluster files simultaneously. We demonstrate, through both an analysis and an extensive experimental study conducted on a public testbed, that our approach significantly decreases both the bandwidth overhead during the maintenance process and the time to repair lost data. We also show that using a non-systematic code does not impact the throughput, and comes only at the price of higher CPU usage. Based on these results, we evaluate the impact of this higher CPU consumption on different configurations of data coldness by determining whether the cluster's network bandwidth dedicated to repair or the CPU dedicated to decoding saturates first.
