Reliability Optimization in the Design of Distributed Systems

The reliability of a distributed system depends on the reliabilities of its communication links and computing elements, as well as on the distribution of its resources, such as programs and data files. A useful measure of reliability in distributed systems is the terminal reliability between a pair of nodes, which is the probability that at least one communication path exists between these nodes. An interesting optimization problem is that of maximizing the terminal reliability between a pair of computing elements under a given budget constraint. Analytical techniques for solving this problem apply only to special forms of reliability expressions. In this paper, three iterative algorithms for terminal reliability maximization are presented. The first two algorithms require the computation of terminal reliability expressions and are therefore efficient only for small networks. The third algorithm, developed for large distributed systems, does not require the computation of terminal reliability expressions; instead, it maximizes approximate objective functions and gives accurate results. Several examples are presented to illustrate the approximate optimization algorithm, and an estimate of the error involved is also given.
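
To make the definition of terminal reliability concrete, the following minimal sketch computes it by exhaustive enumeration of link states. It is not one of the three algorithms of the paper. It assumes perfectly reliable nodes, undirected links, and statistically independent link failures; the function name `terminal_reliability` and the five-link bridge network are hypothetical and chosen only for illustration.

```python
from itertools import product


def terminal_reliability(links, source, target):
    """Probability that at least one working path joins source and target.

    links: list of (u, v, p) tuples, where p is the probability that the
    undirected link u-v works. Assumes independent link failures and
    perfectly reliable nodes (illustrative assumptions).
    """
    total = 0.0
    # Enumerate every up/down combination of the links: 2**len(links) states.
    for states in product((True, False), repeat=len(links)):
        prob = 1.0
        adjacency = {}
        for (u, v, p), up in zip(links, states):
            prob *= p if up else 1.0 - p
            if up:
                adjacency.setdefault(u, set()).add(v)
                adjacency.setdefault(v, set()).add(u)
        # Depth-first search over the working links only.
        seen, stack = {source}, [source]
        while stack:
            node = stack.pop()
            for neighbor in adjacency.get(node, ()):
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        if target in seen:
            total += prob
    return total


# Hypothetical bridge network: two parallel two-hop routes plus a cross link.
bridge = [
    ("s", "a", 0.9),
    ("s", "b", 0.9),
    ("a", "b", 0.9),
    ("a", "t", 0.9),
    ("b", "t", 0.9),
]
print(terminal_reliability(bridge, "s", "t"))
```

The number of states doubles with every added link, so this brute-force approach is practical only for very small networks. That exponential cost is the reason an optimization method that avoids evaluating the reliability expression is attractive for large systems.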
