Conservative Confidence Bounds in Safety, from Generalised Claims of Improvement & Statistical Evidence

“Proven-in-use”, “globally-at-least-equivalent” and “stress-tested” are concepts that arise in diverse contexts in the acceptance, certification or licensing of critical systems. Their common feature is that dependability claims for a system in a certain operational environment are supported, in part, by evidence – viz. of successful operation – concerning different, though related, system(s) and/or environment(s), together with an auxiliary argument that the target system/environment offers the same, or improved, safety. We propose a formal probabilistic (Bayesian) organisation for these arguments. Through specific examples of evidence for the “improvement” argument above, we demonstrate scenarios in which formalising such arguments substantially increases confidence in the target system, and show why this is not always the case. The example scenarios concern vehicles and nuclear plants. Besides supporting stronger claims, the mathematical formalisation imposes precise statements of the bases for “improvement” claims: seemingly similar forms of prior belief are sometimes revealed to imply substantial differences in the claims they can support.
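The flavour of such arguments can be conveyed with a minimal numerical sketch. This is not the paper's model: the uniform Beta(1, 1) prior on the probability of failure per demand (pfd), the demand counts, and the use of a Fréchet inequality to combine the "improvement" belief with operational evidence are all illustrative assumptions made here for concreteness.

```python
def posterior_confidence(n: int, p: float) -> float:
    # Illustrative assumption: a uniform Beta(1, 1) prior on the pfd.
    # After n failure-free demands the posterior is Beta(1, n + 1),
    # whose CDF has this closed form: P(pfd <= p) = 1 - (1 - p)^(n + 1).
    return 1.0 - (1.0 - p) ** (n + 1)


def conservative_transfer(theta: float, conf_old: float) -> float:
    # Suppose the "improvement" claim -- the target system is at least
    # as safe as the predecessor -- is believed with probability theta,
    # and conf_old is the confidence that the predecessor's pfd <= p.
    # The Frechet inequality P(A and B) >= P(A) + P(B) - 1 then gives a
    # conservative (worst-case) confidence that the target's pfd <= p,
    # with no independence assumption between the two beliefs.
    return max(0.0, theta + conf_old - 1.0)


# Hypothetical numbers: 5000 failure-free demands observed on the
# predecessor system, a claimed bound of pfd <= 1e-3, and 95% belief
# in the "improvement" argument.
conf_old = posterior_confidence(5000, 1e-3)
conf_new = conservative_transfer(0.95, conf_old)
print(f"predecessor confidence: {conf_old:.3f}")
print(f"conservative target confidence: {conf_new:.3f}")
```

The sketch shows the trade-off the abstract alludes to: when the "improvement" belief theta is weak, the transferred confidence can be lower than what direct evidence alone would support, which is one way formalisation reveals that seemingly similar prior beliefs support substantially different claims.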
