Scalable consistency in Scatter

Distributed storage systems often trade off strong semantics for improved scalability. This paper describes the design, implementation, and evaluation of Scatter, a scalable and consistent distributed key-value storage system. Scatter adopts the highly decentralized and self-organizing structure of scalable peer-to-peer systems, while preserving linearizable consistency even under adverse circumstances. Our prototype implementation demonstrates that even with very short node lifetimes, it is possible to build a scalable and consistent system with practical performance.
