2.4.4 Prediction of Information System Availability in Mission Critical and Business Critical Applications

Availability is one of the most important attributes of on-line computer systems that perform mission-critical or business-critical applications. Systems engineers are often called upon to predict the reliability of such systems as part of proposal preparation, architecture definition, design reviews, and operation. However, traditional modeling techniques cannot handle integrated hardware and software systems with multiple states and redundancy. This paper describes how Markov modeling and reliability block diagrams can be used together to model a large on-line information system and to answer strategic questions about the configuration and operation of high-availability computing systems. The analyses are performed using MEADEP, a reliability analysis tool that supports hierarchical modeling and integrates Markov and block diagram techniques. For the three-tier e-commerce site used as the example in this paper, it is shown that (a) the most frequently failing subsystem is not necessarily the availability bottleneck, and (b) restoration time is often a more important parameter than availability when attempting to maximize system throughput.
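The hierarchical idea can be illustrated with a minimal sketch, which is not MEADEP and not the model developed in this paper. It assumes each tier is described by a small Markov model (a two-state up/down chain, or a three-state chain for a redundant pair with a single repair crew) and that the tiers are combined as a series reliability block diagram. All function names, tier names, and failure and restoration rates below are hypothetical and chosen only for illustration.

```python
# Illustrative sketch only: hypothetical rates, not data from the paper.
# Rates are per hour: failure_rate = 1/MTBF, repair_rate = 1/MTTR.

def two_state_availability(failure_rate, repair_rate):
    """Steady-state availability of a single up/down Markov model."""
    return repair_rate / (failure_rate + repair_rate)

def redundant_pair_availability(failure_rate, repair_rate):
    """Steady-state availability of two identical units, one repair crew.

    Birth-death chain: both up -> one up -> both down.
    The tier is considered up while at least one unit is up.
    """
    w_both_up = 1.0
    w_one_up = w_both_up * (2 * failure_rate) / repair_rate
    w_both_down = w_one_up * failure_rate / repair_rate
    total = w_both_up + w_one_up + w_both_down
    return (w_both_up + w_one_up) / total

def series_availability(availabilities):
    """Series block diagram: the system is up only if every block is up."""
    result = 1.0
    for a in availabilities:
        result *= a
    return result

# Hypothetical three-tier site: the web servers fail most often per unit,
# but are redundant and quickly restored; the database fails rarely but
# takes long to restore and has no redundancy in this sketch.
tiers = {
    "web": redundant_pair_availability(1 / 500, 1 / 0.5),
    "app": redundant_pair_availability(1 / 1000, 1 / 2),
    "db": two_state_availability(1 / 5000, 1 / 8),
}

system = series_availability(tiers.values())
bottleneck = min(tiers, key=tiers.get)

for name, a in tiers.items():
    print(f"{name}: availability = {a:.6f}")
print(f"system: availability = {system:.6f}, bottleneck = {bottleneck}")
```

Under these assumed rates the database tier, which has the lowest failure rate, limits system availability, which is consistent with finding (a). The outcome depends entirely on the chosen rates and redundancy, so the sketch shows the mechanics of combining the two techniques rather than any result of the paper.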