Self-healing by means of automatic workarounds

We propose using automatic workarounds to achieve self-healing in software systems. We observe that software systems of significant complexity, especially those built from components, are often redundant, in the sense that the same functionality and the same state transition can be obtained through multiple sequences of operations. This redundancy is the basis for constructing effective workarounds for component failures. In particular, we assume that failures can be detected and intercepted together with a trace of the operations that led to the failure. Given the failing sequence, the system autonomically executes one or more alternative sequences that are known to have equivalent behavior. We argue that such workarounds can be derived with reasonable effort from many forms of specifications, that they can be prioritized effectively either statically or dynamically, and that they can be deployed at run time in a completely automated way; they therefore amount to a valid self-healing mechanism. We develop this notion of self-healing by detailing a method to represent, derive, and deploy workarounds, and we validate the method in two case studies.
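The core idea (intercept a failing operation sequence, then replay an equivalent one) can be illustrated with a minimal Python sketch. Everything in it is a hypothetical illustration rather than the method developed in the paper: the `FlakyList` component, its simulated fault in `add_all`, the single equivalence rule `expand_add_all`, and the `run_with_workarounds` driver are all assumed names, and rollback is approximated by deep-copying the component's state.

```python
import copy


class FlakyList:
    """Toy component (hypothetical): add works, add_all has a simulated fault."""

    def __init__(self):
        self.items = []

    def add(self, x):
        self.items.append(x)

    def add_all(self, xs):
        raise RuntimeError("simulated fault in add_all")


def expand_add_all(seq):
    """Assumed equivalence: add_all([a, b, ...]) behaves like add(a); add(b); ..."""
    out = []
    for name, args in seq:
        if name == "add_all":
            out.extend(("add", (x,)) for x in args[0])
        else:
            out.append((name, args))
    return out


# Rules are tried in list order, which acts as a static prioritization.
RULES = [expand_add_all]


def run(component, seq):
    """Execute a sequence of (method_name, args) operations on the component."""
    for name, args in seq:
        getattr(component, name)(*args)


def run_with_workarounds(component, seq, rules):
    """Run seq; on failure, roll back and try equivalent sequences in order.

    Returns the sequence that finally succeeded.
    """
    checkpoint = copy.deepcopy(component.__dict__)
    try:
        run(component, seq)
        return seq
    except Exception:
        pass  # failure detected; fall through to the workarounds
    for rule in rules:
        alt = rule(seq)
        if alt == seq:
            continue  # rule does not apply to this sequence
        component.__dict__ = copy.deepcopy(checkpoint)  # roll back state
        try:
            run(component, alt)
            return alt
        except Exception:
            continue
    component.__dict__ = checkpoint  # leave the component in its initial state
    raise RuntimeError("no workaround succeeded")


c = FlakyList()
used = run_with_workarounds(c, [("add", (1,)), ("add_all", ([2, 3],))], RULES)
print(c.items)  # [1, 2, 3]
```

In this sketch the failing call to `add_all` is replaced by repeated calls to `add`, so the component reaches the intended state without the caller noticing the fault. A dynamic prioritization could, for instance, reorder `RULES` according to which rules succeeded in the past.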
