RECODE: Reconfigurable, consistent and decentralized data services

Key-based routing schemes, in which a message is forwarded towards the server responsible for a partition of a large name space, do not provide strong delivery guarantees when the network is reconfigured as servers join and leave. This best-effort behavior is sufficient for eventually consistent data services such as key-value stores, content distribution networks, or publish/subscribe systems. However, such schemes cannot provide the stronger consistency guarantees required by, for example, metadata services and databases. We present RECODE, a framework for reconfigurable, consistent and decentralized data services. RECODE simplifies the implementation of strongly consistent data services and continues to provide strong guarantees even during reconfiguration. More specifically, we introduce the routecast primitive, which delivers messages for a key in the same total order, independent of the servers responsible for the key. We demonstrate the expressiveness and practical usability of RECODE by presenting three applications: a map of atomic registers, a set of distributed counters, and a lease management system. We evaluate the performance and elasticity of RECODE executing in a cluster.
