Empirical Study of Unstable Leaders in Paxos. Thesis by Long Kai. © 2013 Long Kai.

This thesis studies the effect of unstable leaders in the Paxos protocol. The Paxos algorithm is one of the most popular solutions for distributed consensus and is often used to build replicated state machines. Paxos guarantees safety regardless of various machine and communication failures. However, liveness is compromised when multiple Paxos leaders exist at the same time. Moreover, despite the extensive literature in the field, implementing the Paxos algorithm for practical systems remains non-trivial. This thesis first studies the implications of multiple Paxos leaders in practical systems and provides an optimization that uses leases. It also provides a complete specification of the classical Paxos protocol. We evaluate our implementation and show the effect of unstable leaders in practical systems.
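The liveness problem mentioned above can be illustrated with a minimal sketch. It is not the implementation evaluated in the thesis: the names `Acceptor`, `duel`, and `stable` are hypothetical, messages are modeled as direct method calls, and phase 1 is simplified to return only a promise (no previously accepted value is reported, which is harmless here because nothing is ever accepted during the duel). Two leaders alternately start a higher ballot just before the other's phase 2 arrives, so no value is ever chosen, while safety is never violated.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Acceptor:
    promised: int = -1                          # highest ballot promised so far
    accepted: Optional[Tuple[int, str]] = None  # (ballot, value) last accepted

    def prepare(self, ballot: int) -> bool:
        """Phase 1: promise to ignore ballots lower than `ballot`."""
        if ballot > self.promised:
            self.promised = ballot
            return True
        return False

    def accept(self, ballot: int, value: str) -> bool:
        """Phase 2: accept unless a higher ballot has already been promised."""
        if ballot >= self.promised:
            self.promised = ballot
            self.accepted = (ballot, value)
            return True
        return False


def majority(votes: List[bool]) -> bool:
    return sum(votes) > len(votes) // 2


def stable() -> bool:
    """A single leader runs phase 1 then phase 2 without interference."""
    acceptors = [Acceptor() for _ in range(3)]
    ok = majority([a.prepare(0) for a in acceptors])
    return ok and majority([a.accept(0, "v") for a in acceptors])


def duel(rounds: int) -> int:
    """Two leaders preempt each other; return how many values get chosen."""
    acceptors = [Acceptor() for _ in range(3)]
    chosen = 0
    ballot = 0
    # Leader A completes phase 1 with ballot 0.
    assert majority([a.prepare(ballot) for a in acceptors])
    pending = ("A", ballot)  # leader whose phase 2 is still in flight
    for _ in range(rounds):
        # The rival starts a higher ballot before the pending phase 2 arrives.
        ballot += 1
        rival = "B" if pending[0] == "A" else "A"
        assert majority([a.prepare(ballot) for a in acceptors])
        # The pending phase 2 now carries a stale ballot and is rejected.
        name, old_ballot = pending
        votes = [a.accept(old_ballot, f"value-from-{name}") for a in acceptors]
        if majority(votes):
            chosen += 1
        pending = (rival, ballot)
    return chosen


if __name__ == "__main__":
    print("stable leader chooses a value:", stable())
    print("values chosen after 10 dueling rounds:", duel(10))
```

A lease, as the abstract suggests, is one way to avoid this interleaving: while one leader holds a valid lease, other nodes refrain from starting a competing ballot. The sketch above does not model leases.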
