Automated revision of distributed and real-time programs

This dissertation concentrates on the problem of automated revision of distributed and real-time programs that are correct-by-construction. In particular, our research addresses the following question: "if an existing program fails to satisfy a property, is it feasible to automatically revise the program inside its current state space and set of transitions, so that the revised program satisfies the failed property while it continues to satisfy its current properties?" We study this problem in two broad contexts: (1) revision in closed systems, where programs do not interact with the environment, and (2) revision in open systems, where programs are subject to a set of uncontrollable faults imposed by the environment. We refer to the former problem as "addition of a property to the input program" and the latter as "addition of fault-tolerance to the input program". We classify our results into three types: (1) polynomial-time sound and complete algorithms, (2) hardness results, and (3) sound efficient heuristics. Throughout this dissertation, we focus on three types of programs: (1) untimed centralized, (2) untimed distributed, and (3) centralized real-time. We omit distributed real-time programs because their structure is very complex and, hence, their formal analysis involves highly complex decision procedures. Thus, it is more beneficial to study the effects of distribution and time on programs separately in order to identify the stumbling blocks. Regarding addition of properties to programs in closed systems, we focus on UNITY safety and progress properties. We are interested in UNITY properties because they have been found highly expressive in specifying a large class of programs.
Regarding addition of fault-tolerance to existing fault-intolerant programs, we consider three levels of fault-tolerance, namely failsafe, nonmasking, and masking, based on satisfaction of safety and liveness properties in the presence of faults. In order to capture time-related behaviors of programs in the presence of faults, we consider two additional levels, namely soft and hard, based on satisfaction of timing constraints in the presence of faults. We address some of the implementation difficulties using BDD-based heuristics for revising programs in both closed and open systems with respect to safety and progress properties. Our experimental results on synthesis of a variety of distributed programs show a performance improvement of several orders of magnitude in both time and space. We also introduce distributed and parallel techniques to improve the performance of our revision methods even further. Finally, we introduce our tool SYCRAFT, which is capable of adding fault-tolerance to moderate-sized fault-intolerant distributed programs. In summary, this dissertation concludes that automated revision of moderate-sized programs (reachable states of size 10^50 and beyond) is feasible in both theory and practice.
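
To make the central question concrete, the following is a minimal sketch of revising a program "inside its current state space and set of transitions" for the simplest case: an untimed centralized program in a closed system. It is an illustration under stated assumptions, not the dissertation's algorithm. The safety property is simplified to "never reach a bad state", every computation is assumed to be infinite (so a state with no outgoing transition counts as a deadlock), and the names `revise_for_safety`, `initial`, and `bad_states` are hypothetical.

```python
from typing import Hashable, Iterable, Set, Tuple

State = Hashable
Transition = Tuple[State, State]


def revise_for_safety(
    transitions: Iterable[Transition],
    initial: Iterable[State],
    bad_states: Iterable[State],
) -> Tuple[Set[State], Set[Transition]]:
    """Revise a program so that it never reaches a bad state.

    Assumptions (for illustration only): safety means avoiding bad_states,
    and a state without outgoing transitions is a deadlock.
    Returns the surviving initial states and the surviving transitions.
    No new state or transition is ever added.
    """
    transitions = set(transitions)
    initial = set(initial)
    states = {s for t in transitions for s in t} | initial

    # Start from all states that do not violate safety.
    keep = states - set(bad_states)

    # Repeatedly drop states that became deadlocked because all of their
    # successors were removed, until a fixpoint is reached.
    while True:
        kept = {(s, t) for (s, t) in transitions if s in keep and t in keep}
        deadlocked = keep - {s for (s, _) in kept}
        if not deadlocked:
            break
        keep -= deadlocked

    return initial & keep, kept


# Example: state 3 is bad; state 2 can only move to 3, so it is removed too.
program = {(0, 1), (1, 0), (1, 2), (0, 2), (2, 3), (3, 3)}
new_initial, new_transitions = revise_for_safety(program, {0}, {3})
```

If the returned set of initial states is empty, no revision exists under these assumptions. The distributed and real-time settings studied in the dissertation add read/write restrictions and timing constraints that this sketch does not model.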
