Availability Modeling of SIP Protocol on IBM

We present an availability model of a high-availability SIP Application Server configuration on WebSphere. The model considers hardware, operating system, and application server failures, together with different types of fault detectors, detection delays, failover delays, restarts, reboots, and repairs. Imperfect coverage for detection, failover, and recovery is incorporated. Computations are based on a set of interacting sub-models of all system components that capture their failure and recovery behavior. The parameter values used in the calculations are drawn from several sources, including field data, high-availability testing, and agreed-upon assumptions. Where a parameter value is uncertain because of assumptions or limited test data, a sensitivity analysis of that parameter is provided. Our analysis identifies the failure types and recovery parameters that have the greatest impact on overall system availability. These results will help guide system improvement efforts in future releases of these products.
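To make the role of imperfect coverage and failover delay concrete, the following is a minimal sketch of a steady-state availability calculation for a hypothetical two-node failover pair, modeled as a small continuous-time Markov chain. It is not the model from the paper: the five states, the transition structure, and every parameter name (`lam`, `mu`, `delta`, `beta`, `coverage`) are assumptions chosen for illustration only.

```python
def steady_state(rates, n):
    """Solve pi * Q = 0 with sum(pi) = 1 for an n-state CTMC.

    rates maps (from_state, to_state) -> transition rate.
    """
    # Build the generator matrix Q (each row sums to zero).
    q = [[0.0] * n for _ in range(n)]
    for (i, j), r in rates.items():
        q[i][j] += r
        q[i][i] -= r
    # Balance equations Q^T pi = 0, with the last one replaced
    # by the normalization condition sum(pi) = 1.
    a = [[q[j][i] for j in range(n)] + [0.0] for i in range(n)]
    a[n - 1] = [1.0] * n + [1.0]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[piv] = a[piv], a[col]
        for r in range(n):
            if r != col:
                f = a[r][col] / a[col][col]
                a[r] = [x - f * y for x, y in zip(a[r], a[col])]
    return [a[i][n] / a[i][i] for i in range(n)]


def availability(lam, mu, delta, beta, coverage):
    """Availability of an illustrative (assumed) two-node failover pair.

    States (hypothetical, not taken from the paper):
      0 both nodes up                   (service up)
      1 detection/failover in progress  (service down)
      2 one node up, one in repair      (service up)
      3 uncovered failure               (service down)
      4 both nodes down                 (service down)
    All rates are per hour.
    """
    rates = {
        (0, 1): 2 * lam * coverage,        # covered failure
        (0, 3): 2 * lam * (1 - coverage),  # uncovered failure
        (1, 2): delta,                     # failover completes
        (3, 2): beta,                      # manual recovery
        (2, 0): mu,                        # failed node repaired
        (2, 4): lam,                       # second failure
        (4, 2): mu,                        # one node repaired
    }
    pi = steady_state(rates, 5)
    return pi[0] + pi[2]


if __name__ == "__main__":
    # Sweep coverage as a toy sensitivity analysis (made-up values).
    for c in (0.90, 0.95, 0.99, 1.00):
        a = availability(lam=1 / 1000, mu=1 / 4, delta=120, beta=2, coverage=c)
        print(f"coverage={c:.2f}  availability={a:.8f}")
```

Sweeping one parameter while holding the others fixed, as in the loop above, is the simplest form of the sensitivity analysis the abstract describes. A model of the real system would compose several such sub-models rather than a single chain.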
