Toward Optimal Storage Scaling via Network Coding: From Theory to Practice

To adapt to increasing storage demands and varying storage redundancy requirements, practical distributed storage systems need to support storage scaling by relocating currently stored data to different storage nodes. However, the scaling process inevitably transfers a substantial amount of data over the network, so minimizing its bandwidth cost is critical in distributed settings. In this paper, we show that optimal storage scaling is achievable in erasure-coded distributed storage based on network coding, by allowing storage nodes to send encoded data during scaling. We formally prove the information-theoretic minimum scaling bandwidth. Based on our theoretical findings, we also build a distributed storage system prototype, NCScale, which realizes network-coding-based scaling while preserving the properties necessary for practical deployment. Experiments on Amazon EC2 show that NCScale reduces the scaling time by up to 50% compared with the state of the art.
