Flease - Lease Coordination Without a Lock Server

Large-scale distributed systems often require scalable and fault-tolerant mechanisms to coordinate exclusive access to shared resources such as files, replicas, or the primary role. The best-known algorithms to implement distributed mutual exclusion with leases, such as Multipaxos, are complex, difficult to implement, and rely on stable storage to persist lease information. In this paper we present FLEASE, an algorithm for fault-tolerant lease coordination in distributed systems that is simpler than Multipaxos and does not rely on stable storage. The evaluation shows that FLEASE can be used to implement scalable, decentralized lease coordination that outperforms a central lock service implementation by an order of magnitude.
