Fused Data Structures for Handling Multiple Faults in Distributed Systems

The paper describes a technique to correct crash faults in large data structures hosted on distributed servers, based on the concept of fused backups. The prevalent solution to this problem is replication. To correct f crash faults among n distinct data structures, replication requires nf additional replicas. If each of the primaries contains O(m) nodes of O(s) size each, this translates to O(nmsf) total backup space. Our technique uses a combination of erasure-correcting codes and selective replication to correct f crash faults using just f additional backups consuming O(msf) total backup space, while incurring minimal overhead during normal operation. Because the data is maintained in coded form, recovery is costlier than with replication. However, in a system with infrequent faults, the savings in space outweigh the cost of recovery. We explore the theory and algorithms for these fused backups and provide a library of such backups for all the data structures in the Java 6 Collections Framework. Our experimental evaluation confirms that fused backups are space-efficient compared to replication, reducing backup space by a factor of almost n, while adding very little overhead to updates. Many real-world distributed systems, such as Amazon's Dynamo data store, use replication to achieve reliability. An alternative, fusion-based design can yield significant savings in space as well as in other resources such as power.
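The space argument can be illustrated with a deliberately simplified sketch. It assumes f = 1, primaries that are fixed-size lists of integers, and a bitwise XOR as the code; the paper itself relies on erasure-correcting codes and selective replication to handle f faults in general data structures, so this is not the authors' algorithm. The function names `fuse`, `apply_update`, and `recover` are hypothetical and chosen only for this example.

```python
from functools import reduce
from operator import xor


def fuse(structures):
    """Return the element-wise XOR of equally sized integer lists."""
    return [reduce(xor, column) for column in zip(*structures)]


def apply_update(fused, index, old_value, new_value):
    """Keep the fused backup current when one primary changes one slot.

    XOR-ing out the old value and XOR-ing in the new one needs no
    communication with the other primaries.
    """
    fused[index] ^= old_value ^ new_value


def recover(fused, survivors):
    """Rebuild the single crashed primary from the backup and the survivors."""
    return fuse([fused, *survivors])


# n = 3 primaries, each holding m = 3 values (assumed toy data).
primaries = [[3, 8, 1], [7, 2, 9], [4, 4, 6]]
fused = fuse(primaries)

# Normal operation: primary 1 overwrites slot 0, and the backup follows.
apply_update(fused, 0, primaries[1][0], 5)
primaries[1][0] = 5

# Primary 1 crashes; the backup plus primaries 0 and 2 restore it.
restored = recover(fused, [primaries[0], primaries[2]])
assert restored == primaries[1]

# Backup slots for f = 1: replication stores n*m*f values, fusion stores m*f.
n, m, f = len(primaries), len(primaries[0]), 1
print("replication slots:", n * m * f, "fused slots:", m * f)
```

The sketch also shows the trade-off named in the abstract: an update touches only one backup slot, while recovery has to read every surviving primary.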
