Preventive maintenance of operational software systems is a technique used to counteract the phenomenon of software "aging". Huang et al. (1995) proposed a technique called "software rejuvenation", in which the software is periodically stopped and then restarted in a "robust" state after proper maintenance. This "renewal" of the software prevents, or at least postpones, the occurrence of crash failures. Previous models of software rejuvenation were all based on a "black box" approach in which the degradation mechanism was modeled by three states: a fully available state, a degraded state from which the decision whether to rejuvenate can be taken, and a crash state. The present paper proposes a fine-grained model for the quantitative analysis of software rejuvenation. The model is based on the assumption that it is possible to identify the current degradation level of the system by monitoring an observable quantity, so that the rejuvenation strategy can be tuned to the measured parameter. Two different strategies are discussed for deciding whether and when to rejuvenate. Furthermore, by resorting to the theory of renewal processes with reward, the steady-state unavailability can be estimated for the various policies, and an optimality criterion can be invoked to determine the proper rejuvenation intervals. A set of numerical experiments concludes the paper.
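As an illustration of the renewal-reward argument, the following minimal sketch estimates the steady-state unavailability of a threshold policy as the ratio of expected downtime per cycle to expected cycle length. It is not the model of the paper: the exponential degradation increments, the fixed monitoring step, and all names and parameter values (`simulate_cycle`, `unavailability`, `crash_level`, `rejuvenation_time`, `repair_time`) are assumptions chosen only to make the example concrete.

```python
import random


def simulate_cycle(rng, threshold, crash_level=10.0, step=1.0,
                   mean_increment=1.0, rejuvenation_time=0.5,
                   repair_time=5.0):
    """Simulate one renewal cycle and return (cycle_length, downtime).

    Assumed toy model: the observable degradation level grows by an
    exponentially distributed increment at every monitoring step.
    """
    level = 0.0
    uptime = 0.0
    while True:
        uptime += step
        level += rng.expovariate(1.0 / mean_increment)
        if level >= crash_level:
            # Crash failure: the cycle ends with a (long) repair.
            return uptime + repair_time, repair_time
        if level >= threshold:
            # Observed level reached the threshold: rejuvenate (short outage).
            return uptime + rejuvenation_time, rejuvenation_time


def unavailability(threshold, cycles=20000, seed=0, **params):
    """Renewal-reward estimate: total downtime / total cycle length."""
    rng = random.Random(seed)
    total_length = 0.0
    total_down = 0.0
    for _ in range(cycles):
        length, down = simulate_cycle(rng, threshold, **params)
        total_length += length
        total_down += down
    return total_down / total_length


if __name__ == "__main__":
    # Scan a few candidate thresholds; the smallest estimate suggests
    # where the best rejuvenation threshold lies under these assumptions.
    for threshold in (2.0, 4.0, 6.0, 8.0, 10.0):
        print(threshold, round(unavailability(threshold), 4))
```

Setting `threshold` equal to `crash_level` disables rejuvenation in this sketch, so comparing it with lower thresholds shows how a policy can trade frequent short outages against rare long ones.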
[1] Kishor S. Trivedi. Probability and Statistics with Reliability, Queuing, and Computer Science Applications. 1984.
[2] Kishor S. Trivedi, et al. Modeling and Analysis of Load and Time Dependent Software Rejuvenation Policies. 1996.
[3] Kishor S. Trivedi, et al. Analysis of software rejuvenation using Markov Regenerative Stochastic Petri Net. Proceedings of Sixth International Symposium on Software Reliability Engineering. ISSRE'95, 1995.
[4] A. W. Marshall, et al. Shock Models and Wear Processes. 1973.
[5] Yennun Huang, et al. Software rejuvenation: analysis, module and applications. Twenty-Fifth International Symposium on Fault-Tolerant Computing. Digest of Papers, 1995.
[6] E. Marshall, et al. Fatal error: how Patriot overlooked a Scud. Science, 1992.
[7] Sachin Garg, et al. Towards Performability Modeling of Software Rejuvenation. 1996.
[8] Terry Williams, et al. Probability and Statistics with Reliability, Queueing and Computer Science Applications. 1983.
[9] Kishor S. Trivedi, et al. Analysis of Preventive Maintenance in Transactions Based Software Systems. IEEE Trans. Computers, 1998.
[10] J. Ben Atkinson, et al. Modeling and Analysis of Stochastic Systems. 1996.