Safety kernel enforcement of software safety policies

Computing systems in which the consequences of failure are very serious are termed safety-critical. Many such systems exist in application areas such as aerospace, defense, transportation, power generation, and medicine. The software in these systems is typically large and complex, critical to system safety, and difficult to implement and verify. Even when great effort is expended to develop the software, there is no assurance that the software will operate with the required level of dependability. We have investigated a safety kernel architecture that addresses part of the problem of building and verifying dependable safety-critical software. An analogous construct, the security kernel, has been used successfully to enforce security policies in classified-information systems. Similar requirements, known as safety policies, must be enforced in safety-critical systems. Other researchers have developed some basic safety kernel concepts and have proposed safety kernel designs. However, many feasibility issues have not been addressed previously. Thus, the focus of this research has been the evaluation and development of the safety kernel as a software architecture for enforcement of safety policies. We have evaluated the feasibility of the safety kernel in four areas: policy enforcement, reliable enforcement, implementation, and verification. The first area addresses the role of the safety kernel and assesses its support for safety-critical systems. The second area examines the requirements for reliable policy enforcement by the safety kernel. The third area focuses on the feasibility of a reuse-oriented implementation strategy. The fourth area considers the verification of the safety kernel. Work in each of these areas has been supported by our involvement with two case studies: the Magnetic Stereotaxis System and the University of Virginia Reactor.
The results presented in this dissertation demonstrate that it is feasible for the safety kernel to enforce a significant set of safety policies--policies that are directly related to device operation. Furthermore, operating in the system context, it can enforce policies reliably in spite of certain component failures. We demonstrate that a special-purpose specification language can be used to describe the safety kernel and that a source code representation of the safety kernel can be mechanically generated from this policy specification. Finally, we define the issues in verification of the safety kernel and demonstrate the feasibility of several analysis and testing techniques.
