Analysis of a two-level software rejuvenation policy

A two-level rejuvenation policy for software systems subject to a degradation process is studied. The strategy considers both full restarts and partial restarts. A semi-Markov process model is constructed, and from its closed-form solution we obtain the system availability as a bivariate function. The rejuvenation policy is then analyzed to maximize the system availability. Numerical examples demonstrate several different scenarios of the software rejuvenation strategy.
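
As a rough illustration of how steady-state availability can be computed for a semi-Markov model, the sketch below applies the standard result that the long-run fraction of time spent in a state is proportional to the embedded chain's stationary probability multiplied by the mean sojourn time in that state. The five states, the transition probabilities, and the sojourn times are hypothetical values chosen only for this example. They are not the model or the parameters of the paper, and the paper's two decision variables are not represented here.

```python
def stationary_distribution(P, iterations=10_000, tol=1e-12):
    """Stationary distribution of an irreducible embedded Markov chain.

    Uses a "lazy" iteration (half stay, half move) so that periodic
    chains converge as well.
    """
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iterations):
        nxt = [
            0.5 * pi[j] + 0.5 * sum(pi[i] * P[i][j] for i in range(n))
            for j in range(n)
        ]
        if max(abs(a - b) for a, b in zip(nxt, pi)) < tol:
            return nxt
        pi = nxt
    return pi


def availability(P, sojourn, up_states):
    """Long-run fraction of time the semi-Markov process spends in up states."""
    pi = stationary_distribution(P)
    total = sum(p * h for p, h in zip(pi, sojourn))
    up = sum(pi[i] * sojourn[i] for i in up_states)
    return up / total


# Hypothetical states (assumed for illustration only):
# 0 robust, 1 degraded, 2 partial restart, 3 full restart, 4 failure recovery
P = [
    [0.0, 1.0, 0.0, 0.0, 0.0],  # robust -> degraded
    [0.0, 0.0, 0.6, 0.3, 0.1],  # degraded -> partial / full restart / failure
    [1.0, 0.0, 0.0, 0.0, 0.0],  # partial restart -> robust
    [1.0, 0.0, 0.0, 0.0, 0.0],  # full restart -> robust
    [1.0, 0.0, 0.0, 0.0, 0.0],  # failure recovery -> robust
]
mean_sojourn_hours = [100.0, 50.0, 0.5, 2.0, 10.0]  # assumed values
toy_availability = availability(P, mean_sojourn_hours, up_states={0, 1})
print(f"toy availability: {toy_availability:.5f}")
```

In a policy study of the kind described above, the transition probabilities and sojourn times would depend on the policy parameters, and the availability would then be maximized over those parameters. This sketch only evaluates a single fixed configuration.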
