This paper develops a methodology for determining and characterizing error latency. The method is based on real workload data, gathered by an experiment instrumented on a VAX 11/780 during the normal workload cycle of the installation. This is the first attempt at jointly studying error latency and workload variations in a full production system. Distributions of error latency were generated by simulating the occurrence of faults under varying workload conditions. The resulting family of error latency distributions illustrates that error latency depends less on the time at which a fault occurred than on the workload that followed the failure. The study finds that the mean error latency varies by a ratio of 1 to 8 (in hours) between high and low workloads. The method is general and can be applied to any system.
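The idea of generating latency distributions by simulating faults under different workload conditions can be illustrated with a toy sketch. This is not the paper's procedure, which relies on measured workload data from the VAX 11/780. Here the 24-hour profile `WORKLOAD`, the per-hour probabilities, and the function names are all hypothetical, and a fault is assumed to be discovered the first hour in which the faulty location is exercised.

```python
import random

# Hypothetical 24-hour profile: per-hour probability that the faulty
# location is exercised (low load at night, high load during the day).
WORKLOAD = [0.05] * 8 + [0.6] * 10 + [0.05] * 6


def simulate_latency(workload, fault_hour, rng):
    """Return hours from fault occurrence until the fault is first exercised."""
    hours = len(workload)
    elapsed = 0
    while True:
        # Probability of exercising the fault depends on the workload
        # in the hour that follows the fault, wrapping around the day.
        p = workload[(fault_hour + elapsed) % hours]
        elapsed += 1
        if rng.random() < p:
            return elapsed


def mean_latency(workload, fault_hour, trials=5000, seed=0):
    """Estimate mean latency for faults injected at a given hour."""
    rng = random.Random(seed)
    total = sum(simulate_latency(workload, fault_hour, rng) for _ in range(trials))
    return total / trials


low = mean_latency(WORKLOAD, fault_hour=0)   # fault followed by low workload
high = mean_latency(WORKLOAD, fault_hour=8)  # fault followed by high workload
print(f"mean latency after low-workload fault:  {low:.2f} h")
print(f"mean latency after high-workload fault: {high:.2f} h")
```

Under these assumptions, faults injected just before a low-workload period show a noticeably longer mean latency than faults injected just before a high-workload period, which mirrors the qualitative finding above. The actual magnitudes depend entirely on the assumed profile.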