Dynamic software rejuvenation policies in a transaction-based system under Markovian arrival processes

This paper presents a Markov decision process (MDP) formulation for a transaction-based system subject to software aging and rejuvenation. Transaction arrivals are described by a Markovian arrival process (MAP), and software aging is modeled by a processing rate that degrades probabilistically. We consider two performance criteria for determining the optimal rejuvenation strategy, the long-run average reward and the power efficiency, and formulate the MDP optimality equations that maximize each. Numerical experiments, including sensitivity and statistical analyses based on real traffic and aging data, show that the optimal rejuvenation policy is monotone and can be characterized as a threshold policy in the number of transactions.
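To make the average-reward formulation concrete, the following is a minimal sketch of relative value iteration on a toy rejuvenation model. It is not the paper's model: arrivals are Poisson rather than a MAP, the power-efficiency criterion is omitted, rejuvenation is assumed to take one uniformized step and to discard all queued transactions, and every name and number (`solve_toy_rejuvenation`, `lam`, `mu`, `aging`, `rejuv_cost`, `loss_cost`) is a hypothetical choice for illustration.

```python
import numpy as np

def solve_toy_rejuvenation(N=10, mu=(1.0, 0.7, 0.4, 0.1), lam=0.6, aging=0.05,
                           rejuv_cost=1.0, loss_cost=0.5,
                           tol=1e-9, max_iter=200_000):
    """Relative value iteration for a toy average-reward rejuvenation MDP.

    State (n, d): n transactions in the system (0..N), degradation level d.
    mu[d] is the processing rate at level d (assumed to decrease with d).
    Actions: 0 = continue, 1 = rejuvenate (reset to state (0, 0)).
    All parameter values are hypothetical.
    """
    mu = np.asarray(mu, dtype=float)
    D = len(mu) - 1
    Lam = lam + mu.max() + aging            # uniformization rate
    n = np.arange(N + 1)[:, None]           # queue lengths as a column
    busy = (n > 0).astype(float)            # 1 if a transaction is in service

    # One-step transition probabilities under "continue"
    p_arr = lam / Lam                       # arrival (blocked when n == N)
    p_srv = busy * mu[None, :] / Lam        # service completion, reward 1
    p_age = np.full((1, D + 1), aging / Lam)
    p_age[0, D] = 0.0                       # no aging beyond the worst level
    p_stay = 1.0 - p_arr - p_srv - p_age    # self-loop from uniformization

    h = np.zeros((N + 1, D + 1))            # relative value function
    for _ in range(max_iter):
        h_arr = np.vstack([h[1:], h[-1:]])        # n -> min(n + 1, N)
        h_srv = np.vstack([h[:1], h[:-1]])        # n -> n - 1
        h_age = np.hstack([h[:, 1:], h[:, -1:]])  # d -> d + 1
        q_cont = (p_srv * 1.0 + p_arr * h_arr + p_srv * h_srv
                  + p_age * h_age + p_stay * h)
        # Rejuvenation: pay a fixed cost plus a cost per discarded transaction
        q_rej = -(rejuv_cost + loss_cost * n) + h[0, 0] + np.zeros_like(h)
        Th = np.maximum(q_cont, q_rej)
        diff = Th - h
        h = Th - Th[0, 0]                   # normalize at reference state (0, 0)
        if diff.max() - diff.min() < tol:   # span stopping criterion
            break

    gain = 0.5 * (diff.max() + diff.min()) * Lam   # average reward per unit time
    policy = (q_rej > q_cont).astype(int)          # 1 where rejuvenation is chosen
    return gain, policy

if __name__ == "__main__":
    gain, policy = solve_toy_rejuvenation()
    print("estimated long-run average reward:", gain)
    print("policy (rows: n, columns: d; 1 = rejuvenate):")
    print(policy)
```

Printing `policy` shows, for each degradation level, the queue lengths at which this toy model chooses rejuvenation. Whether the result has a threshold shape depends on the hypothetical parameters; the sketch says nothing about the paper's own numerical findings.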
