On Distributed Computing Systems Reliability Analysis Under Program Execution Constraints

This paper presents an algorithm for computing the reliability of distributed computing systems (DCS). The algorithm, called the Fast Reliability Evaluation Algorithm, is based on the factoring theorem and employs several reliability-preserving reduction techniques. The effects of file distributions, program distributions, and various topologies on the reliability of the DCS are studied in detail using the proposed algorithm. Compared with existing algorithms on various network topologies, file distributions, and program distributions, the proposed algorithm is much more economical in both time and space. The ARPA network is used as a case study to illustrate the feasibility of the proposed algorithm for computing distributed program reliability.
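For readers unfamiliar with the factoring theorem mentioned above, the following is a minimal sketch of the identity R(G) = p_e · R(G with e contracted) + (1 − p_e) · R(G with e deleted), applied to plain two-terminal network reliability. It is not the Fast Reliability Evaluation Algorithm itself: it assumes perfectly reliable nodes and independent link failures, applies no reduction techniques, and ignores file and program distributions. The function name and the edge-list representation are hypothetical, chosen only for illustration.

```python
def two_terminal_reliability(edges, s, t):
    """Probability that s and t are connected (illustrative sketch).

    edges: list of (u, v, p) tuples, where p is the probability that the
    undirected link u-v works. Nodes are assumed perfect and links are
    assumed to fail independently (assumptions of this sketch).
    """
    if s == t:
        # Terminals have been merged by contractions: connected for sure.
        return 1.0
    if not edges:
        # No links left and terminals still distinct: disconnected.
        return 0.0

    (u, v, p), rest = edges[0], edges[1:]

    def merge(x):
        # Contracting edge (u, v) renames v to u everywhere.
        return u if x == v else x

    # Case 1: the edge works. Contract it and drop resulting self-loops.
    contracted = [
        (merge(a), merge(b), q)
        for a, b, q in rest
        if merge(a) != merge(b)
    ]
    r_works = two_terminal_reliability(contracted, merge(s), merge(t))

    # Case 2: the edge fails. Delete it.
    r_fails = two_terminal_reliability(rest, s, t)

    # Factoring theorem: condition on the state of the chosen edge.
    return p * r_works + (1.0 - p) * r_fails


if __name__ == "__main__":
    # Five-link bridge network between terminals "s" and "t".
    bridge = [
        ("s", "a", 0.9), ("s", "b", 0.9),
        ("a", "b", 0.9),
        ("a", "t", 0.9), ("b", "t", 0.9),
    ]
    print(two_terminal_reliability(bridge, "s", "t"))
```

This plain recursion is exponential in the number of links. Factoring-based algorithms typically apply reliability-preserving reductions (for example series and parallel reductions) before each factoring step to shrink the recursion tree, which is the role the abstract assigns to its reduction techniques.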
