Optimizing software rejuvenation policy for tasks with periodic inspections and time limitation

Abstract: Software aging has been observed in diverse types of software systems. It causes gradual performance degradation with time and/or load and eventually leads to system failure. To mitigate aging effects and prevent the serious losses caused by system failure, software rejuvenation can be performed proactively to restore system performance. This paper models and optimizes a state-based rejuvenation policy for software systems that perform real-time computing tasks and undergo periodic inspections. At each scheduled inspection, the system state is evaluated, and a rejuvenation decision function applied to that state determines whether to rejuvenate. The duration of each rejuvenation procedure (which corresponds to system downtime) depends on the system state and on the amount of task work accomplished before the rejuvenation decision. Because the rejuvenation policy determines the timing and number of rejuvenations performed during task processing, it can significantly affect the probability that the system completes the real-time task by a given deadline. In this work, we optimize the state-based rejuvenation policy to maximize the probability of task completion (PTC) of periodically inspected software systems. The methodology comprises an event transition-based iterative method for quantifying the PTC and the application of a Genetic Algorithm to derive the optimal rejuvenation policy. Examples demonstrate the proposed methodology and the influence of several parameters (e.g., inspection interval, rejuvenation time) on the optimization results.
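To make the setting concrete, the following is a minimal Python sketch of a periodically inspected system with a state-threshold rejuvenation rule. It is not the paper's method: the PTC is estimated here by Monte Carlo simulation rather than by the event transition-based iterative method, and an exhaustive search over thresholds stands in for the Genetic Algorithm. The degradation model, the downtime formula, and all names and parameter values (`simulate_task`, `estimate_ptc`, `best_threshold`, `aging_prob`, and so on) are illustrative assumptions.

```python
import random

def simulate_task(threshold, work_required=100.0, deadline=200.0,
                  inspection_interval=10.0, n_states=5, aging_prob=0.3,
                  rng=None):
    """Simulate one task run; return True if it completes by the deadline.

    Assumed toy model: the system has aging states 0..n_states-1, and its
    processing speed falls linearly with the state. The state is inspected
    every `inspection_interval` time units, and the system is rejuvenated
    when state >= threshold (the assumed decision function).
    """
    rng = rng or random.Random()
    t, done, state = 0.0, 0.0, 0
    while t < deadline:
        speed = 1.0 - 0.15 * state              # assumed degradation law
        step = min(inspection_interval, deadline - t)
        if (work_required - done) / speed <= step:
            return True                         # task finishes in this interval
        done += speed * step
        t += step
        # Assumed aging: the state may advance once per interval.
        if state < n_states - 1 and rng.random() < aging_prob:
            state += 1
        # Inspection: rejuvenate if the decision function says so.
        if state >= threshold:
            # Assumed downtime: grows with the state and with completed work.
            t += 2.0 + 1.0 * state + 0.02 * done
            state = 0
    return False

def estimate_ptc(threshold, runs=2000, seed=0, **params):
    """Monte Carlo estimate of the probability of task completion (PTC)."""
    rng = random.Random(seed)
    hits = sum(simulate_task(threshold, rng=rng, **params) for _ in range(runs))
    return hits / runs

def best_threshold(n_states=5, **params):
    """Exhaustive search over thresholds (a stand-in for the GA)."""
    candidates = range(n_states + 1)            # n_states means "never rejuvenate"
    return max(candidates,
               key=lambda k: estimate_ptc(k, n_states=n_states, **params))

if __name__ == "__main__":
    for k in range(6):
        print(k, estimate_ptc(k))
    print("best threshold:", best_threshold())
```

A single threshold is the simplest possible decision function. A policy closer to the one described above would also let the decision depend on the amount of work already completed, which enlarges the search space and is one reason a heuristic optimizer such as a Genetic Algorithm becomes useful.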
