ClusterSR: Cluster-Aware Scattered Repair in Erasure-Coded Storage

Erasure coding is a storage-efficient means of guaranteeing data reliability in today’s commodity storage systems, yet its repair performance is seriously hindered by substantial repair traffic. Repair in clustered storage systems is even more complicated because cross-cluster bandwidth is scarce. We present ClusterSR, a cluster-aware scattered repair approach. ClusterSR minimizes cross-cluster repair traffic by carefully choosing the clusters from which chunks are read and in which they are repaired. It further balances cross-cluster repair traffic by scheduling the repair of multiple chunks. Large-scale simulation and Alibaba Cloud ECS experiments show that ClusterSR reduces cross-cluster repair traffic by 6.7-52.7% and improves repair throughput by 14.1-68.8%.
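To make the idea of choosing clusters for reading and repairing concrete, the following is a toy sketch only, not the ClusterSR algorithm itself. It assumes a simplified cost model that the abstract does not state: each helper cluster aggregates the chunks it reads locally and sends a single partial result across clusters, and chunks already in the repair cluster cost nothing. The function names, the cluster labels, and the (k = 6) stripe layout are all hypothetical.

```python
def cross_cluster_traffic(survivors, k, target):
    """Count cross-cluster transfers needed to rebuild one chunk in `target`.

    survivors: dict mapping cluster name -> surviving chunks of the stripe there.
    Assumed model: every remote helper cluster that is used sends exactly one
    aggregated chunk; chunks local to `target` are free.
    """
    needed = k - min(survivors.get(target, 0), k)
    # Reading from the fullest remote clusters first minimizes how many are used.
    remote = sorted((n for c, n in survivors.items() if c != target), reverse=True)
    used = 0
    for n in remote:
        if needed <= 0:
            break
        needed -= n
        used += 1
    if needed > 0:
        raise ValueError("fewer than k surviving chunks; stripe is unrecoverable")
    return used


def best_repair_cluster(survivors, k, candidates):
    """Pick the candidate repair cluster with the least cross-cluster traffic."""
    costs = ((c, cross_cluster_traffic(survivors, k, c)) for c in sorted(candidates))
    return min(costs, key=lambda pair: pair[1])


# Hypothetical stripe: k = 6 data chunks needed; cluster C has lost one chunk.
survivors = {"A": 3, "B": 3, "C": 2}
print(best_repair_cluster(survivors, 6, ["A", "B", "C", "D"]))  # ('A', 1)
```

Under this assumed model, repairing in cluster A or B needs one cross-cluster transfer, while repairing in C or in an empty cluster D needs two, which illustrates why the choice of repair cluster matters. Balancing the traffic across clusters when many chunks are repaired at once is a separate scheduling problem that this sketch does not address.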
