Paper information - Comparing primary-backup and state machines for crash failures

Comparing primary-backup and state machines for crash failures

Two of the more prevalent methods of constructing a highly available service are the primary-backup (e.g., [2]) and the state machine (e.g., [5]) approaches. Both methods are widely used, and a common wisdom has developed about their relative strengths and weaknesses. Our research attempts to make this comparison more formally, under the crash failure model and a synchronous system model. The metrics on which the comparison is based concern the service response time, which is the time elapsed from when a client initially sends a request to the service until the client receives the response from the service. Common wisdom says that in terms of best-case and expected service response times, primary-backup is better than state machines, but in terms of worst-case service response time, state machines are better than primary-backup. Hence, common wisdom argues that the state machine approach is a better choice for real-time systems in which schedulability is a concern, but primary-backup is a better choice for most other cases. Our comparison necessitated fully specifying the two methods. The primary-backup approach has been specified [2] in terms of how a service must appear to a client, but not in terms of how a client interacts with the service, except that the client sends requests only to the process it believes to be the primary. We have considered three different client-service protocols. It is easy to show that the primary-backup approach with any of these client-service protocols is optimal for the best-case service response time. Specifying the state machine approach has proven to be more difficult. We have found it convenient to categorize the approach into three cases, depending on how the reliable broadcasts from clients to the service are ordered.
We refer to these as the sequencer approach (for example, [4]), the consensus approach (for example, [6]), and the a priori approach (for example, [3]). In the a priori approach, the order of delivery is determined by timestamping requests with a real clock value and enqueuing them at the server until any earlier messages have been received. Thus the ordering latency (that is, the time from when a client sends a request to the service until the service delivers the request) is the same for all failure patterns, including failure-free runs. This approach is provably optimal in the worst case, but a penalty is paid in that every request has the same ordering latency. In the consensus approach, the order of the requests is determined jointly by all of the (nonfaulty) servers in the system. To attain this consensus, a consensus protocol either must be run repeatedly [1] or must be initiated when a process of the service receives a client’s request. Running consensus repeatedly can yield a small ordering latency, but is very expensive in terms of the number of messages used. Starting consensus on receipt of a client’s request, on the other hand, has a larger ordering latency but uses fewer messages. Neither is optimal in either the best-case or the worst-case service response time. In the sequencer-based approach, there is infinitely often a single server that unilaterally decides the order of a set of requests. In this sense, it is analogous to the primary-backup approach, and some of the lower bounds for primary-backup apply to sequencer-based state machines as well. Furthermore, there exist protocols of this approach that are optimal in the best-case service response time and are better than primary-backup in expected service response time. Hence, contrary to common wisdom, we have found that in terms of best-case service response time, both primary-backup and state machines can be optimal.
In terms of worst-case service response time, an optimal state machine service can be constructed, while an optimal primary-backup service cannot. However, the best-case service response time for this state machine service will be very poor. Finally, there are state machine protocols that have expected service response times better than any primary-backup protocol.
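
The a priori ordering described in the abstract can be illustrated with a minimal Python sketch. It is not the protocol from the paper or from [3]. It assumes a hypothetical known bound `delta` (maximum message delay plus clock skew in a synchronous system), so that a server may treat a request stamped `t` as safe to deliver once its local clock reaches `t + delta`. The names `Request`, `AprioriOrderer`, `receive`, and `deliver_ready` are invented for illustration, and ties between equal timestamps are broken by client id as a further assumption.

```python
import heapq
from dataclasses import dataclass, field


@dataclass(order=True)
class Request:
    timestamp: float                     # client's clock value when the request was sent
    client_id: int                       # assumed tie-breaker for equal timestamps
    payload: str = field(compare=False)  # not used for ordering


class AprioriOrderer:
    """Holds requests until no earlier-stamped request can still arrive."""

    def __init__(self, delta: float):
        # delta is a hypothetical bound: max message delay plus clock skew.
        self.delta = delta
        self._pending = []  # min-heap ordered by (timestamp, client_id)

    def receive(self, request: Request) -> None:
        heapq.heappush(self._pending, request)

    def deliver_ready(self, now: float) -> list:
        """Return, in timestamp order, every request whose hold time has passed."""
        ready = []
        while self._pending and self._pending[0].timestamp + self.delta <= now:
            ready.append(heapq.heappop(self._pending))
        return ready
```

Under these assumptions every request waits exactly `delta` after its timestamp whether or not any process crashes, which matches the abstract's point that the ordering latency is identical across all failure patterns. Two servers that receive the same requests in different arrival orders deliver them in the same order, since delivery depends only on the timestamps and the tie-breaker.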

Keith Marzullo | Jeremy B. Sussman

[1] Henri E. Bal, et al. An efficient reliable broadcast protocol, 1989, OPSR.

[2] Fred B. Schneider, et al. Primary-Backup Protocols: Lower Bounds and Optimal Implementations, 1992.

[3] Fred B. Schneider, et al. Implementing fault-tolerant services using the state machine approach: a tutorial, 1990, CSUR.

[4] Piotr Berman, et al. Quick Atomic Broadcast (Extended Abstract), 1993, WDAG.