Fast crash recovery in RAMCloud

RAMCloud is a DRAM-based storage system that provides inexpensive durability and availability by recovering quickly after crashes, rather than storing replicas in DRAM. RAMCloud scatters backup data across hundreds or thousands of disks, and it harnesses hundreds of servers in parallel to reconstruct lost data. The system uses a log-structured approach for all its data, in DRAM as well as on disk: this provides high performance both during normal operation and during recovery. RAMCloud employs randomized techniques to manage the system in a scalable and decentralized fashion. In a 60-node cluster, RAMCloud recovers 35 GB of data from a failed server in 1.6 seconds. Our measurements suggest that the approach will scale to recover larger memory sizes (64 GB or more) in less time with larger clusters.

[1]  Michael Stonebraker,et al.  Implementation techniques for main memory database systems , 1984, SIGMOD '84.

[2]  Randy H. Katz,et al.  A case for redundant arrays of inexpensive disks (RAID) , 1988, SIGMOD '88.

[3]  Andrew R. Cherenson,et al.  The Sprite network operating system , 1988, Computer.

[4]  Maurice Herlihy,et al.  Linearizability: a correctness condition for concurrent objects , 1990, TOPL.

[5]  Hector Garcia-Molina,et al.  Main Memory Database Systems: An Overview , 1992, IEEE Trans. Knowl. Data Eng..

[6]  Mendel Rosenblum,et al.  The design and implementation of a log-structured file system , 1991, SOSP '91.

[7]  Eli Upfal,et al.  Balanced allocations (extended abstract) , 1994, STOC '94.

[8]  David J. Goodman,et al.  Personal Communications , 1994, Mobile Communications.

[9]  Sara McMains,et al.  File System Logging versus Clustering: A Performance Comparison , 1995, USENIX.

[10]  Michael Mitzenmacher,et al.  The Power of Two Choices in Randomized Load Balancing , 2001, IEEE Trans. Parallel Distributed Syst..

[11]  Robert Morris,et al.  Chord: A scalable peer-to-peer lookup service for internet applications , 2001, SIGCOMM 2001.

[12]  GhemawatSanjay,et al.  The Google file system , 2003 .

[13]  Werner Vogels,et al.  Dynamo: amazon's highly available key-value store , 2007, SOSP.

[14]  Michael Stonebraker,et al.  H-store: a high-performance, distributed main memory transaction processing system , 2008, Proc. VLDB Endow..

[15]  Hans-Arno Jacobsen,et al.  PNUTS: Yahoo!'s hosted data serving platform , 2008, Proc. VLDB Endow..

[16]  Wilson C. Hsieh,et al.  Bigtable: A Distributed Storage System for Structured Data , 2006, TOCS.

[17]  Sinfonia: A new paradigm for building scalable distributed systems , 2009, TOCS.

[18]  Jeffrey Dean,et al.  Evolution and future directions of large-scale storage and computation systems at Google , 2010, SoCC '10.

[19]  Hairong Kuang,et al.  The Hadoop Distributed File System , 2010, 2010 IEEE 26th Symposium on Mass Storage Systems and Technologies (MSST).

[20]  Mahadev Konar,et al.  ZooKeeper: Wait-free Coordination for Internet-scale Systems , 2010, USENIX ATC.

[21]  Parag Agrawal,et al.  The case for RAMCloud , 2011, Commun. ACM.

[22]  Gade Krishna,et al.  A scalable peer-to-peer lookup protocol for Internet applications , 2012 .