A methodology for constructing a stabilizing crash-tolerant application

This paper is an exercise in constructing a stabilizing mutual-exclusion protocol that withstands a single crash failure. We begin with a collection of distributed processes arranged in a ring. The resulting protocol is stabilizing by construction: a stabilizing protocol converges to correct behavior regardless of its initial state. A faulty process is automatically removed from the system and, after repair, automatically reintegrated into it. Our technique can be generalized to other systems by substituting appropriate protocols for the various components.
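To make the notion of stabilization on a ring concrete, the following is a minimal sketch of Dijkstra's classic K-state token ring, a well-known stabilizing mutual-exclusion protocol. It is not the protocol constructed in this paper and it does not model crash failures; it only illustrates convergence to a single privilege from an arbitrary initial state. The function names, the ring size `n`, the state count `k`, and the random scheduler are assumptions chosen for illustration.

```python
import random


def privileged(x):
    """Return the indices of processes that currently hold a privilege."""
    n = len(x)
    # Process 0 is privileged when its state equals that of its predecessor,
    # which on the ring is the last process.
    holders = [0] if x[0] == x[n - 1] else []
    # Every other process is privileged when it differs from its predecessor.
    holders += [i for i in range(1, n) if x[i] != x[i - 1]]
    return holders


def step(x, k, rng):
    """Let one privileged process, picked at random, make its move."""
    i = rng.choice(privileged(x))  # at least one process is always privileged
    if i == 0:
        x[0] = (x[0] + 1) % k      # process 0 increments modulo k
    else:
        x[i] = x[i - 1]            # others copy their predecessor


def run_until_stable(x, k, rng, max_steps=10_000):
    """Step until exactly one process is privileged; return the steps taken."""
    for steps in range(max_steps):
        if len(privileged(x)) == 1:
            return steps
        step(x, k, rng)
    raise RuntimeError("did not converge within max_steps")


# Illustrative run: 5 processes, 6 states (k >= n), arbitrary initial state.
rng = random.Random(0)
n, k = 5, 6
x = [rng.randrange(k) for _ in range(n)]
steps_taken = run_until_stable(x, k, rng)
```

Once a single privilege remains, every later move passes it to the next process on the ring, so mutual exclusion continues to hold. Handling a crashed process, as the paper sets out to do, needs additional machinery that this sketch leaves out.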
