Two methods for computing bounds for the distribution of cumulative reward for large Markov models

Degradable fault-tolerant systems can be evaluated using rewarded continuous-time Markov chain (CTMC) models. In that context, a useful measure is the distribution of the cumulative reward over a time interval [0, t]. All currently available numerical methods for computing that measure become very expensive when the product of the maximum output rate of the CTMC model and t is large, and in that case their application is limited to CTMC models of moderate size. In this paper, we develop two methods for computing bounds for the cumulative reward distribution of CTMC models with reward rates associated with states: BT/RT (Bounding Transformation/Regenerative Transformation) and BT/BRT (Bounding Transformation/Bounding Regenerative Transformation). The methods require the selection of a regenerative state, are numerically stable, and compute the bounds with well-controlled error. For a class of rewarded CTMC models, class C'''_1, and a particular, natural selection of the regenerative state, the BT/BRT method allows us to trade off bound tightness against computational cost and will provide bounds at a moderate computational cost in many cases of interest. For a class of models, class C''_1, slightly wider than class C'''_1, and a particular, natural selection of the regenerative state, the BT/RT method will yield tighter bounds at a higher computational cost. Under additional conditions, the bounds obtained using the less expensive version of BT/BRT and using BT/RT seem to be tight either for any value of t or for values of t that are not small, depending on the initial probability distribution of the model. Class C''_1 and class C'''_1 models satisfying these additional conditions include both exact and bounding typical failure/repair performability models of fault-tolerant systems with exponential failure and repair time distributions, repair in every state with failed components, and a reward rate structure that is a non-increasing function of the collection of failed components.
We illustrate both the applicability and the performance of the methods using a large CTMC performability example of a fault-tolerant multiprocessor system.
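
To make the measure concrete, the following minimal sketch estimates the cumulative reward distribution of a small rewarded CTMC by plain Monte Carlo simulation. It is not the BT/RT or BT/BRT method of the paper, and it gives a statistical estimate rather than bounds. The function names, the dictionary-based model encoding, and the two-state up/down example with its rates are all assumptions made for illustration.

```python
import random

def sample_cumulative_reward(rates, rewards, start, t, rng):
    """Simulate one CTMC path on [0, t] and return the reward accumulated.

    rates[i] maps each successor j to the transition rate i -> j (hypothetical
    encoding); rewards[i] is the reward rate earned per unit time in state i.
    """
    state, clock, total = start, 0.0, 0.0
    while True:
        out = rates[state]
        out_rate = sum(out.values())
        if out_rate == 0.0:
            # Absorbing state: earn its reward rate until the horizon.
            return total + rewards[state] * (t - clock)
        hold = rng.expovariate(out_rate)  # exponential holding time
        if clock + hold >= t:
            # Horizon reached before the next jump.
            return total + rewards[state] * (t - clock)
        total += rewards[state] * hold
        clock += hold
        # Pick the next state with probability proportional to its rate.
        u = rng.random() * out_rate
        acc = 0.0
        for nxt, rate in out.items():
            acc += rate
            if u < acc:
                break
        state = nxt

def estimate_distribution(rates, rewards, start, t, s, n_paths=20000, seed=0):
    """Estimate P[cumulative reward over [0, t] <= s] from n_paths simulations."""
    rng = random.Random(seed)
    hits = sum(
        sample_cumulative_reward(rates, rewards, start, t, rng) <= s
        for _ in range(n_paths)
    )
    return hits / n_paths

# Illustrative model: state 0 = up (reward rate 1), state 1 = down (reward rate 0).
example_rates = {0: {1: 0.01}, 1: {0: 1.0}}
example_rewards = {0: 1.0, 1: 0.0}

if __name__ == "__main__":
    print(estimate_distribution(example_rates, example_rewards, 0, 10.0, 9.5))
```

The simulated path changes state roughly (maximum output rate) × t times, which mirrors why the cost of numerical methods grows with that product, as noted in the abstract.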
