Error Masking: A Source of Failure Dependency in Multi-Version Programs

This paper presents empirical measurements of failure dependency between the known faults detected in an earlier software diversity experiment (PODS). The results show that some apparently unrelated pairs of faults had high, and very similar, levels of dependency. This is explained in terms of an error masking process. It is shown that this process is likely to occur in many software applications, including the missile launcher application used in the Knight and Leveson experiment. Error masking behaviour can be predicted from the specification (prior to implementation), and simple modifications to the program design can minimize the error masking effect and hence the observed dependency.
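One way to picture how error masking could make unrelated faults fail together is the toy sketch below. It is not the PODS data or the authors' analysis. The threshold, the input range, the two fault patterns, and all function names are hypothetical, chosen only for illustration. Two versions corrupt an internal value on unrelated sets of inputs, but a final boolean decision hides (masks) the corruption everywhere except close to the decision threshold, so both versions can only fail in the same narrow region of the input space.

```python
THRESHOLD = 500        # hypothetical decision threshold
INPUTS = range(1000)   # hypothetical input space, every input equally likely


def reference(x):
    """Fault-free decision for input x."""
    return x >= THRESHOLD


def version_a(x):
    """Hypothetical fault A: the internal value is 10 too high on even inputs."""
    value = x + 10 if x % 2 == 0 else x
    return value, value >= THRESHOLD


def version_b(x):
    """Hypothetical fault B: the internal value is 10 too high on multiples of 3."""
    value = x + 10 if x % 3 == 0 else x
    return value, value >= THRESHOLD


def measure():
    """Count internal errors and output failures for each version and for both."""
    counts = dict(err_a=0, err_b=0, err_ab=0, fail_a=0, fail_b=0, fail_ab=0)
    for x in INPUTS:
        value_a, out_a = version_a(x)
        value_b, out_b = version_b(x)
        err_a, err_b = value_a != x, value_b != x          # wrong internal state
        fail_a = out_a != reference(x)                     # wrong final output
        fail_b = out_b != reference(x)
        counts["err_a"] += err_a
        counts["err_b"] += err_b
        counts["err_ab"] += err_a and err_b
        counts["fail_a"] += fail_a
        counts["fail_b"] += fail_b
        counts["fail_ab"] += fail_a and fail_b
    return counts


def dependency_ratio(joint, first, second, total):
    """Observed joint probability divided by the product expected under independence."""
    return (joint * total) / (first * second)


if __name__ == "__main__":
    c = measure()
    n = len(INPUTS)
    print(c)
    print("internal errors:", dependency_ratio(c["err_ab"], c["err_a"], c["err_b"], n))
    print("output failures:", dependency_ratio(c["fail_ab"], c["fail_a"], c["fail_b"], n))
```

In this toy setting the internal errors of the two versions are statistically independent (ratio 1.0), while the output failures coincide roughly 133 times more often than independence would predict, because the decision masks every internal error except those within 10 of the threshold. The numbers depend entirely on the assumed fault patterns and should not be read as estimates for any real system.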

[1] Michael R. Lyu, et al. In search of effective diversity: a six-language study of fault-tolerant flight control software, 1988, The Eighteenth International Symposium on Fault-Tolerant Computing, Digest of Papers.

[2] Nancy G. Leveson, et al. An experimental evaluation of the assumption of independence in multiversion programming, 1986, IEEE Transactions on Software Engineering.

[3] Peter G. Bishop, et al. PODS — A project on diverse software, 1986, IEEE Transactions on Software Engineering.

[4] David F. McAllister, et al. A large scale second generation experiment in multi-version software: description and early results, 1988, The Eighteenth International Symposium on Fault-Tolerant Computing, Digest of Papers.

[5] Algirdas Avižienis. Fault-tolerance and fault-intolerance: Complementary approaches to reliable computing, 1975.

[6] Dave E. Eckhardt, et al. A Theoretical Basis for the Analysis of Multiversion Software Subject to Coincident Errors, 1985, IEEE Transactions on Software Engineering.

[7] Bev Littlewood, et al. A Conceptual Model of the Effect of Diverse Methodologies on Coincident Failures in Multi-Version Software, 1987, Fehlertolerierende Rechensysteme.