A generalized algorithm for evaluating distributed-program reliability

A one-step algorithm, GEAR (generalized evaluation algorithm for reliability), is introduced that computes the reliability of a distributed computing system (DCS), whose shared resources typically include processing elements, memory units, input/output devices, data files, and programs. System reliability is the probability that a task or application can be executed successfully by sharing the required resources on the DCS. Four important reliability measures defined from this concept are discussed: terminal-pair, computer-network, distributed-program, and distributed-system reliability. GEAR is general enough to compute all four measures and requires no prior knowledge of multiterminal connections to derive the reliability expression. Many examples are included to illustrate the usefulness of GEAR for computing reliability measures of a DCS.
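To make the definition of reliability as a probability concrete, the following is a minimal sketch that computes two of the measures by brute-force enumeration of link states. It is not GEAR, whose steps are not given in this abstract. It assumes perfectly reliable nodes, independent link failures, and that computer-network reliability means all nodes remain connected. The names `connected`, `reliability`, and the triangle network are hypothetical.

```python
from itertools import product

def connected(terminals, up_edges):
    """Return True if every node in terminals lies in one connected component
    of the graph formed by the working (up) links."""
    needed = set(terminals)
    start = next(iter(needed))
    seen = {start}
    stack = [start]
    while stack:
        u = stack.pop()
        for a, b in up_edges:
            if a == u and b not in seen:
                seen.add(b)
                stack.append(b)
            elif b == u and a not in seen:
                seen.add(a)
                stack.append(a)
    return needed <= seen

def reliability(edges, terminals):
    """Sum the probabilities of all link states in which the terminals are
    connected. edges maps (u, v) to the probability that the link works.
    Exhaustive enumeration: cost grows as 2**len(edges), so this is only
    suitable for very small illustrative networks."""
    links = list(edges)
    total = 0.0
    for state in product([True, False], repeat=len(links)):
        prob = 1.0
        for link, up in zip(links, state):
            prob *= edges[link] if up else 1.0 - edges[link]
        up_edges = [link for link, up in zip(links, state) if up]
        if connected(terminals, up_edges):
            total += prob
    return total

# Hypothetical three-node network; each link works with probability 0.9.
triangle = {("A", "B"): 0.9, ("B", "C"): 0.9, ("A", "C"): 0.9}
terminal_pair = reliability(triangle, ["A", "C"])          # A can reach C
all_terminal = reliability(triangle, ["A", "B", "C"])      # all nodes connected
```

For this triangle, the terminal-pair value is 1 - (0.1)(1 - 0.81) = 0.981 and the all-node value is 3(0.81)(0.1) + 0.729 = 0.972. Distributed-program reliability would additionally require tracking which nodes hold the needed programs and data files, which this sketch does not model.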
