RADON: Repairable Atomic Data Object in Networks

Erasure codes offer an efficient way to decrease storage and communication costs while implementing atomic memory service in asynchronous distributed storage systems. In this paper, we provide erasure-code-based algorithms that additionally support background repair of crashed nodes. A repair operation on a node in the crashed state is triggered externally, and is carried out by that node via message exchanges with other active nodes in the system. Upon completion of repair, the node re-enters the active state and resumes participation in ongoing and future read, write, and repair operations. To guarantee liveness and atomicity simultaneously, existing works assume either the presence of nodes with stable storage or the presence of nodes that never crash during the execution. We require neither; instead, we consider a natural and practical network stability condition $N1$ that only restricts the number of nodes in the crashed/repair state during the broadcast of any message. We present an erasure-code-based algorithm $RADON_C$ that is always live, and guarantees atomicity as long as condition $N1$ holds. When the number of concurrent writes is limited, $RADON_C$ has significantly lower storage and communication costs than a replication-based algorithm $RADON_R$, which also works under $N1$. We further show how a slightly stronger network stability condition $N2$ can be used to construct algorithms that never violate atomicity. This guarantee of atomicity comes at the expense of an additional phase in the read and write operations.
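
As a rough illustration of why erasure codes can reduce storage cost relative to replication, the sketch below compares the total storage for a single version of an object under full replication and under a generic $[n, k]$ MDS code, in which each node keeps a coded element of size $1/k$ of the object. The function names and the example parameters are assumptions made for illustration only; these are not the cost expressions of $RADON_C$ or $RADON_R$, which the abstract does not state and which also depend on the number of concurrent writes.

```python
from fractions import Fraction


def replication_storage(n: int) -> Fraction:
    """Total storage (in units of the object size) when each of n nodes keeps a full copy."""
    if n < 1:
        raise ValueError("need n >= 1")
    return Fraction(n)


def mds_storage(n: int, k: int) -> Fraction:
    """Total storage for one version under a generic [n, k] MDS code.

    Each node keeps one coded element of size 1/k of the object, and any k
    coded elements suffice to decode. Illustrative model only.
    """
    if not 1 <= k <= n:
        raise ValueError("need 1 <= k <= n")
    return Fraction(n, k)


def tolerated_erasures(n: int, k: int) -> int:
    """Number of missing coded elements an [n, k] MDS code can tolerate."""
    if not 1 <= k <= n:
        raise ValueError("need 1 <= k <= n")
    return n - k


if __name__ == "__main__":
    n, k = 5, 3  # hypothetical parameters chosen for the example
    print("replication:", replication_storage(n))
    print("[n, k] MDS code:", mds_storage(n, k))
    print("tolerated erasures:", tolerated_erasures(n, k))
```

In this simplified model, the saving grows with $k$, while the number of tolerated erasures $n - k$ shrinks, which is the basic trade-off any erasure-code-based design has to balance.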
