We have developed a software toolkit, called CoBFIT, that implements a comprehensive suite of protocols for efficient and dynamic replication of a stateful service in an asynchronous environment such as the Internet. The CoBFIT toolkit can be used to make a generic service fault-tolerant using the state machine replication approach [7], provided the service can be structured as a deterministic state machine. The toolkit is implemented in C++ using the ACE object-oriented network programming toolkit [6] and is on the verge of an open-source release.

Figure 1 shows the various components in the CoBFIT toolkit. The framework components form the foundation upon which the protocol components are built; detailed descriptions of all the CoBFIT protocol and framework components are given in [5]. The CoBFIT toolkit implements many framework components that provide a common foundation not only for the specific protocols that we have implemented, but also for similar distributed fault-tolerant protocols. This foundation includes several services that are commonly needed for implementing distributed protocols: event handling, network communication, management of protocol components, and cryptographic primitives (by interfacing with the Cryptlib library [1]).

The CoBFIT suite of protocols is unique in that, unlike the protocols in previous group communication systems, ours do not make correctness conditional upon the ability to remove suspected members from the group. The CoBFIT protocols can be used individually to provide their specific properties (e.g., reliable broadcast) or together to realize a dynamic replication group that satisfies the virtual synchrony property [4] in the asynchronous model. The toolkit includes the following asynchronous Byzantine-fault-tolerant protocols:
[1] Frank Buschmann, et al. C++ Network Programming: Systematic Reuse with ACE and Frameworks, Vol. 2. 2002.
[2] William H. Sanders, et al. Parsimonious service replication for tolerating malicious attacks in asynchronous environments. 2006.
[3] Roy Friedman, et al. Strong and weak virtual synchrony in Horus. Proceedings 15th Symposium on Reliable Distributed Systems, 1996.
[4] Victor Shoup, et al. Secure and Efficient Asynchronous Broadcast Protocols. CRYPTO, 2001.
[5] Douglas C. Schmidt, et al. Systematic reuse with ACE and frameworks. 2003.
[6] Fred B. Schneider, et al. Implementing fault-tolerant services using the state machine approach: a tutorial. CSUR, 1990.