A Deliberative Reasoner for Model-Based Software Health Management

Traditional design-time and off-line approaches to testing and verification contribute significantly to the dependability of software, but they may not cover every fault scenario that a system could encounter at runtime. Runtime ‘health management’ of complex embedded software systems is therefore needed to improve their dependability. Our approach to Software Health Management uses concepts from the field of ‘Systems Health Management’: detection, diagnosis and mitigation. In earlier work we showed how to use a reactive mitigation strategy, specified as a timed state machine model, for the system health manager. This paper describes the algorithm and key concepts of an alternative approach to system mitigation using a deliberative strategy, which relies on a function-allocation model to identify alternative component-assembly configurations that can restore the functions needed to meet the goals of the system.

Keywords: component-based systems; fault diagnosis; autonomic computing; fault removal.
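As a rough illustration of the deliberative idea, the sketch below searches a function-allocation model for a component configuration that still provides every goal function once faulty components are excluded. It is not the paper's algorithm: the model layout (`ALLOCATION`), the component and function names, the `find_configuration` helper, and the "fewest components" selection rule are all assumptions made for this example.

```python
from itertools import product

# Hypothetical function-allocation model: each system function maps to a list
# of alternative component sets, any one of which can provide that function.
ALLOCATION = {
    "navigation": [{"gps"}, {"imu", "estimator"}],
    "telemetry": [{"radio_a"}, {"radio_b"}],
}


def find_configuration(goals, allocation, faulty):
    """Return a smallest set of healthy components that provides every goal
    function, or None if some goal function cannot be restored."""
    options = []
    for function in goals:
        # Keep only the alternatives that use no faulty component.
        healthy = [alt for alt in allocation.get(function, []) if not (alt & faulty)]
        if not healthy:
            return None  # this function has no usable alternative left
        options.append(healthy)

    best = None
    # Exhaustively try one alternative per goal function (assumed small model).
    for choice in product(*options):
        config = set().union(*choice)
        if best is None or len(config) < len(best):
            best = config
    return best


if __name__ == "__main__":
    print(find_configuration(["navigation", "telemetry"], ALLOCATION, {"gps", "radio_a"}))
```

The exhaustive search is only reasonable for small models; a real reasoner would likely need a more scalable search and would also have to account for timing and resource constraints, which this sketch ignores.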
