Evaluation of fault-tolerant systems with nonhomogeneous workloads

A methodology is presented for evaluating fault-tolerant systems when workloads and fault arrivals are not time-homogeneous. Of particular interest are systems whose environments vary considerably between utilization phases of random duration. In such cases, evaluations of overall system performability must account for the corresponding differences in workload effects, especially with regard to fault recovery. The proposed methodology uses analytic techniques based on Markov processes and stochastic activity networks. Examples of evaluation studies using this approach are presented, including the evaluation of a system in which self-exercising is varied between phases of passive and active use.
