Byzantine Fault Tolerance for Nondeterministic Applications

All practical applications contain some degree of nondeterminism. When such applications are replicated to achieve Byzantine fault tolerance (BFT), their nondeterministic operations must be controlled to ensure replica consistency. To the best of our knowledge, only the simplest types of replica nondeterminism have been dealt with so far, and a systematic approach to handling the common types of nondeterminism is lacking. In this paper, we propose a classification of common types of replica nondeterminism with respect to the requirements of achieving Byzantine fault tolerance, and describe the design and implementation of the core mechanisms necessary to handle such nondeterminism within a Byzantine fault tolerance framework.
