Computationally Efficient and Numerically Stable Reliability Bounds for Repairable Fault-Tolerant Systems

The transient analysis of large continuous-time Markov reliability models of repairable fault-tolerant systems is computationally expensive because the models are stiff. We develop and analyze a method to compute bounds for a measure defined on a particular but quite wide class of continuous-time Markov models, which encompasses both exact and bounding continuous-time Markov reliability models of fault-tolerant systems. The method is numerically stable and computes the bounds with a well-controlled error that can be specified in advance. Computational effort can be traded off against the accuracy of the bounds. For a class of continuous-time Markov models, class C", which includes typical failure/repair reliability models with exponential failure and repair time distributions and repair in every state with failed components, the method can yield reasonably tight bounds at a very small computational cost. The method builds upon regenerative randomization, a recently proposed numerical method for the transient analysis of continuous-time Markov models.
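To make the idea of bounds with an error specified in advance concrete, the following is a minimal sketch of standard randomization (uniformization), the classical technique that regenerative randomization refines. It is not the method of this paper. The three-state model (two parallel components, one repair rate, an absorbing failed state), the function name unreliability_bounds, and all parameter names are assumptions made for illustration only.

```python
import math

def unreliability_bounds(lam, mu, t, eps=1e-10, max_steps=100000):
    """Lower/upper bounds on P(system failed by time t), gap at most about eps.

    Hypothetical model: state 0 = both components up, state 1 = one failed,
    state 2 = system failed (absorbing). lam = per-component failure rate,
    mu = repair rate in state 1. The chain starts in state 0.
    """
    # Generator matrix of the continuous-time Markov chain.
    Q = [[-2.0 * lam, 2.0 * lam, 0.0],
         [mu, -(lam + mu), lam],
         [0.0, 0.0, 0.0]]
    n = len(Q)
    # Randomization rate: the largest exit rate (must be > 0).
    rate = max(-Q[i][i] for i in range(n))
    # Transition matrix of the randomized discrete-time chain, P = I + Q/rate.
    P = [[(1.0 if i == j else 0.0) + Q[i][j] / rate for j in range(n)]
         for i in range(n)]

    pi = [1.0, 0.0, 0.0]            # distribution after k = 0 steps
    weight = math.exp(-rate * t)    # Poisson probability of k = 0 (underflows if rate*t is very large)
    mass = weight                   # Poisson mass accumulated so far
    lower = weight * pi[2]
    k = 0
    # Stop once the neglected Poisson tail is at most eps.
    while 1.0 - mass > eps and k < max_steps:
        k += 1
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        weight *= rate * t / k
        mass += weight
        lower += weight * pi[2]
    # Every neglected term is a probability in [0, 1] times a Poisson weight,
    # so the true value lies between the truncated sum and that sum plus the tail.
    upper = lower + max(0.0, 1.0 - mass)
    return lower, upper, k

if __name__ == "__main__":
    lo, hi, steps = unreliability_bounds(lam=0.01, mu=1.0, t=10.0)
    print(lo, hi, steps)
```

The number of steps grows roughly with rate * t, which is why stiff models (fast repairs, slow failures, long mission times) are expensive under standard randomization. The naive Poisson recursion above is also only safe for moderate rate * t and for eps well above machine precision.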
