A model of shared DASD and multipathing

This paper presents a model of an I/O subsystem in which devices can be accessed from multiple CPUs and/or via alternative channel and control unit paths. The model estimates access response times, given access rates for all CPU-device combinations. The systems treated are those having the IBM System/370 architecture, with each path consisting of a CPU, channel, control unit, head of string, and device with rotational position sensing (RPS). The path selected for an access at seek initiation time remains in effect for the entire channel program. The computation proceeds in three stages. First, the feasibility of the prescribed access rates is determined by solving a linear programming problem. Second, the splitting of access rates among the available paths is determined so as to satisfy the following principle: the probability of selecting a given path is proportional to the probability that the path is free. This condition leads to a set of nonlinear equations, which can be solved by means of the Newton-Raphson method. Third, the RPS hit probability, i.e., the probability that the path is free when the device is ready to transmit, is computed in the following manner. From the point of view of the selected path, the system may be viewed as being in one of 25 possible states. There are 12 different subsets of states whose aggregate probabilities can be computed from the (by now) known flow rates over the various paths. The maximum entropy principle is used to calculate the unknown state probabilities, with the known aggregate probabilities acting as constraints. The required RPS hit probability can be computed easily once the state probabilities have been determined. Explicit formulas are given for all these quantities. Empirically derived formulas are used to compute the RPS miss probability on subsequent revolutions, given the probability on the first revolution.
The model is validated against a simulator, showing excellent agreement for systems with path utilizations up to 50 percent. The model is also validated against measurements from a real three-CPU system with 31 shared devices. In this validation, the I/O subsystem model acts as a common submodel to three copies of a system model, one for each CPU. Estimated end-user transaction response times show excellent agreement with the live measurements.
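
The path-splitting principle of the second stage can be illustrated with a small sketch. It is not the paper's formulation: the abstract does not give the equations, so the sketch assumes that a path is free with probability equal to the product of the free probabilities of its components (an independence approximation), that every access holds each component of its path for the same hypothetical `service_time`, and that the device's own traffic counts toward component utilization. It also swaps in a damped fixed-point iteration in place of the Newton-Raphson method named above. The function names, component labels, and numbers are illustrative only.

```python
def path_free_probabilities(shares, rate, service_time, paths, background):
    """Return the probability that each path is free (toy independence model).

    shares      -- fraction of the device's accesses sent over each path
    rate        -- access rate of the device (accesses per second)
    service_time-- assumed time each access holds every component on its path
    paths       -- list of tuples of component names, e.g. ("chA", "cu1")
    background  -- dict: component name -> utilization due to other traffic
    """
    # Utilization of each component: background load plus this device's flow.
    util = dict(background)
    for share, path in zip(shares, paths):
        for comp in path:
            util[comp] = util.get(comp, 0.0) + rate * share * service_time

    # Assumption: components are busy independently of one another.
    free = []
    for path in paths:
        p = 1.0
        for comp in path:
            p *= max(0.0, 1.0 - util[comp])
        free.append(p)
    return free


def split_access_rate(rate, service_time, paths, background,
                      damping=0.5, tol=1e-12, max_iter=500):
    """Find shares with share_k proportional to P(path k is free)."""
    n = len(paths)
    shares = [1.0 / n] * n  # start from an even split
    for _ in range(max_iter):
        free = path_free_probabilities(shares, rate, service_time,
                                       paths, background)
        total = sum(free)
        if total == 0.0:
            raise ValueError("all paths saturated; rates are infeasible")
        target = [f / total for f in free]
        # Damped update toward the shares implied by the current free probs.
        updated = [(1.0 - damping) * s + damping * t
                   for s, t in zip(shares, target)]
        change = max(abs(u - s) for u, s in zip(updated, shares))
        shares = updated
        if change < tol:
            break
    return shares


# Hypothetical example: one device reachable over two channels that share
# a single control unit; channel chA carries more background traffic.
paths = [("chA", "cu1"), ("chB", "cu1")]
background = {"chA": 0.4, "chB": 0.1, "cu1": 0.2}
shares = split_access_rate(rate=10.0, service_time=0.02,
                           paths=paths, background=background)
print([round(s, 3) for s in shares])  # prints [0.4, 0.6]
```

In this example the more heavily loaded channel receives the smaller share of the device's accesses, and at the solution the shares are in the same ratio as the path-free probabilities, which is the condition stated in the principle.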