FTSyn: a framework for automatic synthesis of fault-tolerance

In this paper, we present a software framework for adding fault-tolerance to existing finite-state programs. The input to our framework is a fault-intolerant program and a class of faults that perturb the program. The output is a fault-tolerant version of the input program. Our framework provides (1) the first automated tool for the synthesis of fault-tolerant distributed programs, and (2) an extensible platform on which researchers can build a repository of heuristics for managing the complexity of adding fault-tolerance to distributed programs. We also present a set of heuristics for polynomial-time addition of fault-tolerance to distributed programs. We have used this framework to automatically synthesize several fault-tolerant programs, including a simplified version of an aircraft altitude switch, token ring, Byzantine agreement, and agreement in the presence of both Byzantine and fail-stop faults. These examples illustrate that our framework can synthesize programs that tolerate different types of faults (process restarts, Byzantine, and fail-stop) as well as programs that are subject to multiple types of faults (Byzantine and fail-stop) simultaneously. We have also found our framework to be highly useful for pedagogical purposes, especially for teaching concepts of fault-tolerance, automatic program transformation, and the effect of heuristics.
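To make the input/output contract described above concrete, the following is a minimal illustrative sketch in Python. It is not FTSyn's algorithm or API. It assumes a program and a fault class are each given as a set of (state, next_state) transitions over integer states, and that safety is given as a set of bad states. All names (`add_failsafe`, `program`, `faults`, `bad_states`) are hypothetical. The sketch handles only a simple failsafe-style requirement: it removes program transitions that lead to states from which faults alone could reach a bad state. It ignores the read/write restrictions of distributed processes, deadlock resolution, and recovery, which are exactly where the complexity that heuristics must address arises.

```python
from typing import Set, Tuple

State = int
Transition = Tuple[State, State]


def add_failsafe(
    program: Set[Transition],
    faults: Set[Transition],
    bad_states: Set[State],
) -> Set[Transition]:
    """Return the program transitions that never enter an unsafe state.

    Hypothetical helper for illustration only: a state is treated as
    unsafe if it is bad, or if fault transitions alone can lead from it
    to a bad state.
    """
    # Backward fixpoint over fault transitions, starting from bad states.
    unsafe = set(bad_states)
    changed = True
    while changed:
        changed = False
        for source, target in faults:
            if target in unsafe and source not in unsafe:
                unsafe.add(source)
                changed = True

    # Keep only program transitions whose target is not unsafe.
    return {(s, t) for (s, t) in program if t not in unsafe}


# Toy input: state 4 is bad, and a fault can move the program from 2 to 4.
program = {(0, 1), (1, 2), (0, 3)}
faults = {(2, 4)}
bad_states = {4}

tolerant = add_failsafe(program, faults, bad_states)
print(sorted(tolerant))
```

In this toy input, the transition (1, 2) is removed because a fault occurring in state 2 would take the program to the bad state 4; the other two transitions are kept.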
