Integrated Design-Stage Failure Analysis of Software-Driven Hardware Systems

Software-driven hardware configurations account for the majority of modern safety-critical complex systems. The often costly failures of such systems can be attributed to software-specific, hardware-specific, or software/hardware interaction failures. Understanding how failures propagate in such systems can give designers critical information because, although a software component may not fail in the sense of losing function, a software operational state can cause an associated hardware failure. Failures are least expensive to address during the design stage of the product life cycle. This research presents a means to evaluate, during the conceptual design stage, how a combined software/hardware system behaves and how such failures propagate to cause potential failures downstream. In particular, this paper proposes the use of high-level system modeling and model-based reasoning approaches to model failure propagation in combined software/hardware systems, introducing the Function-Failure Identification and Propagation (FFIP) analysis framework to help formalize the design of safety-critical systems. The fact that hardware and software designers do not share the same background, knowledge, methods, or language contributes significantly to software/hardware interaction failures. A high-level systems analysis method such as FFIP is geared toward unifying language and modeling concepts and may help bridge that gap. The technique is applied to the design of the Reaction Control System Jet Selection of the NASA space shuttle to evaluate failure propagation within it, specifically for the redundancy management system. The paper concludes with the extensions and mappings to the software domain that are required for a truly integrated methodology.
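
As a rough illustration of what downstream failure propagation across a combined software/hardware model can mean, the sketch below walks a small directed graph of functions and lists everything reachable from a fault origin. It is not the FFIP framework itself: the function names, the `DOWNSTREAM` graph, and the `propagate` helper are all hypothetical, and the traversal captures reachability only, without the behavioral simulation or function-failure reasoning the abstract describes.

```python
from collections import deque

# Hypothetical function-level model (an assumption, not taken from the paper).
# Each key is a function; its list names the functions that consume its output.
DOWNSTREAM = {
    "monitor_pressure (hardware)": ["select_jets (software)"],
    "select_jets (software)": ["command_valve (software)"],
    "command_valve (software)": ["open_valve (hardware)"],
    "open_valve (hardware)": ["fire_jet (hardware)"],
    "fire_jet (hardware)": [],
}


def propagate(graph, origin):
    """Return the functions reachable from `origin`, in breadth-first order.

    The origin itself is listed first. This is plain graph reachability,
    used here as a simplified stand-in for failure propagation.
    """
    affected = [origin]
    seen = {origin}
    queue = deque([origin])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                affected.append(nxt)
                queue.append(nxt)
    return affected


if __name__ == "__main__":
    # An off-nominal software state at the valve command reaches hardware.
    for name in propagate(DOWNSTREAM, "command_valve (software)"):
        print(name)
```

In this toy model, a fault that originates in a software function reaches two hardware functions even though no software function has lost its function, which is the kind of cross-domain effect the abstract highlights.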
