Paper information - Fault-containing self-stabilizing algorithms

Fault-containing self-stabilizing algorithms

Self-stabilization provides a non-masking approach to fault tolerance. Given this fact, one would hope that in a self-stabilizing system, the amount of disruption caused by a fault is proportional to the severity of the fault. However, this is not true for many self-stabilizing systems. Our paper addresses this weakness of distributed self-stabilizing systems by introducing the notion of fault containment. Informally, a fault-containing self-stabilizing algorithm is one that contains the effects of limited transient faults while retaining the property of self-stabilization. The paper begins with a formal framework for specifying and evaluating fault-containing self-stabilizing protocols. Then, it is shown that self-stabilization and fault containment are goals that can conflict. For example, it is shown that imposing an O(1) bound on the worst-case recovery time from a 1-faulty state necessitates added overhead for stabilization: for some tasks, the O(1) recovery time implies that stabilization time cannot be within O(1) rounds of the optimum value. The paper then presents a transformer T that maps any non-reactive self-stabilizing algorithm P into an equivalent fault-containing self-stabilizing algorithm Pf that can repair any 1-faulty state in O(1) time with O(1) space overhead. This transformation is based on a novel stabilizing timer paradigm that significantly simplifies the task of fault containment. The paper concludes by generalizing the transformer T into a parameterized transformer T(k) such that for varying k we obtain varying performance measures for Pf.
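To illustrate the claim that disruption need not be proportional to the severity of a fault, the following is a minimal sketch of a toy self-stabilizing distance computation on a path. It is not taken from the paper: the path topology, the synchronous round model, the update rule, and the names `step` and `recover` are all assumptions chosen for illustration. Starting from a legitimate state, one variable is corrupted (a 1-faulty state), and the sketch counts how many rounds recovery takes and how many nodes ever hold a wrong value.

```python
def step(dist):
    """One synchronous round: each non-root node recomputes its distance
    from the values its neighbours held at the start of the round."""
    n = len(dist)
    new = [0] * n  # node 0 is the root and always holds distance 0
    for i in range(1, n):
        neighbours = [dist[i - 1]]
        if i + 1 < n:
            neighbours.append(dist[i + 1])
        new[i] = 1 + min(neighbours)
    return new


def recover(dist, max_rounds=1000):
    """Run rounds until the legitimate state (dist[i] == i) is reached.
    Returns the final state, the number of rounds used, and the set of
    nodes that held a wrong value at any point."""
    legit = list(range(len(dist)))
    disturbed = {i for i, d in enumerate(dist) if d != legit[i]}
    rounds = 0
    while dist != legit and rounds < max_rounds:
        dist = step(dist)
        rounds += 1
        disturbed |= {i for i, d in enumerate(dist) if d != legit[i]}
    return dist, rounds, disturbed


n = 8
state = list(range(n))  # legitimate state on a path of 8 nodes
state[5] = 0            # a single transient fault (hypothetical 1-faulty state)
final, rounds, disturbed = recover(state)
print("rounds to recover:", rounds)
print("nodes that ever held a wrong value:", sorted(disturbed))
```

In this toy setting the single corrupted value spreads to neighbouring nodes before the system converges again, which is the kind of uncontained behaviour that fault containment is meant to prevent. The sketch does not implement the transformer T or the stabilizing timer described in the abstract.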
