Adding Fault-Tolerance Using Pre-synthesized Components

We present a hybrid synthesis method for the automatic addition of fault-tolerance to distributed programs. In particular, where existing heuristics fail to add fault-tolerance, we automatically specify and add pre-synthesized fault-tolerance components to the program. Because the same pre-synthesized components can be reused across different programs, the effort spent synthesizing one program carries over to the synthesis of another. Our synthesis method is sound in that the synthesized fault-tolerant program satisfies its specification in the absence of faults and provides the desired level of fault-tolerance in the presence of faults. We illustrate our synthesis method by adding pre-synthesized components with linear topology to a token ring program that tolerates the corruption of all processes. We have also reused the same component in the synthesis of a fault-tolerant alternating bit protocol. Elsewhere, we have applied this method to add pre-synthesized components with hierarchical topology.
