Combining abstraction with Byzantine fault-tolerance

This thesis describes a technique to build replicated services that combines Byzantine fault tolerance with work on abstract data types. Tolerating Byzantine faults is important because software errors are a major cause of outages and they can make faulty replicas behave arbitrarily. Abstraction hides implementation details to enable the reuse of existing service implementations and to improve the ability to mask software errors. We improve resilience to software errors by enabling the recovery of faulty replicas using state stored in replicas with distinct implementations; using an opportunistic N-version programming technique that runs distinct, off-the-shelf implementations at each replica to reduce the probability of common mode failures; and periodically repairing each replica using an abstract view of the state stored by the correct replicas in the group, which improves tolerance to faults due to software aging. We have built two replicated services that demonstrate the use of this technique. The first is an NFS service where each replica runs a different off-the-shelf file system implementation. The second is a replicated version of the Thor object-oriented database. In this case, the methodology enabled reuse of the existing database code, which is non-deterministic. These examples suggest that our technique can be used in practice: Our performance results show that the replicated systems perform comparably to their original, non-replicated versions. Furthermore, both implementations required only a modest amount of new code, which reduces the likelihood of introducing more errors and keeps the monetary cost of using our technique low.

Thesis Supervisor: Barbara Liskov
Title: Ford Professor of Engineering

Para o meu avô Vítor Hugo

O poeta é um fingidor
Finge tão completamente
Que chega a fingir que é dor
A dor que deveras sente.
Fernando Pessoa, Autopsicografia

To my grandfather Vitor Hugo

The poet is a pretender
He pretends so absolutely
He manages to pretend a pain
From the real pain he feels.
Fernando Pessoa, Self-Psychography
