System availability assessment using stochastic models

Availability assessment is essential to ensuring uninterrupted operation of commercial-grade information and networked systems. In this paper, we present practical case studies that show how stochastic analytic models can be used to quantitatively assess the availability of such systems. We cover non-state-space models, state-space models, and hierarchical models, and describe each modeling approach in detail. Copyright (c) 2012 John Wiley & Sons, Ltd.
