Fault-local distributed mending (extended abstract)

As communication networks grow, existing fault-handling tools that rely on global measures, such as global time-outs or reset procedures, become increasingly unaffordable, since their cost grows with the size of the network. For a fault-handling mechanism to scale to large networks, its cost must instead depend only on the number of failed nodes (which, thanks to today’s technology, grows much more slowly than the networks themselves). Moreover, it should allow the non-faulty regions of the network to continue operating even while the faulty parts recover. This abstract introduces the concept of fault locality and the notion of fault-locally mendable problems: problems for which there exist correction algorithms (applied after faults) whose cost depends only on the (unknown) number of faults. We show that any problem is fault-locally mendable. The solution involves a novel technique combining data structures and “local votes” among nodes, which may be of interest in its own right.
