A fast algorithm for reliability-oriented task assignment in a distributed system

Distributed systems (DS) have become a major trend in computer system design because of their high speed and high reliability. Reliability is an important performance parameter in DS design, and it is affected by how programs and data files are distributed across the system. Designers usually add redundant copies of software and/or hardware to increase system reliability. The reliability-oriented task assignment problem, which is NP-hard, is to find a task distribution that maximizes program reliability or system reliability. In this paper, we develop a heuristic, reliability-oriented task allocation scheme for DS that finds an approximate solution. Simulation shows that, in most test cases with one copy, the algorithm finds suboptimal solutions efficiently. When the algorithm cannot obtain an optimal solution, the deviation from the optimum is very small; the approach is therefore well suited to this class of problems.
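To make the problem statement concrete, the following is a minimal sketch of reliability-oriented task assignment on a toy instance. It is not the heuristic developed in the paper. The reliability model (a series product over the processors and links actually used), the per-processor capacity limit, and every name and number in it are assumptions made only for illustration. The search is exhaustive and examines every one of the n^m assignments of m tasks to n processors, which is the cost a heuristic is meant to avoid.

```python
from itertools import product
from math import prod


def assignment_reliability(assign, node_rel, link_rel, comm_pairs):
    """Reliability of one assignment under an assumed series model.

    assign[t] is the processor hosting task t. The result is the product of
    the reliabilities of every processor in use and of every link joining
    two processors whose tasks communicate. A missing link counts as 0.0.
    """
    used_nodes = set(assign)
    used_links = set()
    for a, b in comm_pairs:
        u, v = assign[a], assign[b]
        if u != v:
            used_links.add((min(u, v), max(u, v)))
    return (prod(node_rel[n] for n in used_nodes)
            * prod(link_rel.get(link, 0.0) for link in used_links))


def best_assignment(num_tasks, node_rel, link_rel, comm_pairs, capacity):
    """Exhaustive search over all assignments (illustration only)."""
    nodes = range(len(node_rel))
    best, best_rel = None, -1.0
    for assign in product(nodes, repeat=num_tasks):
        # Hypothetical constraint: at most `capacity` tasks per processor.
        if any(assign.count(n) > capacity for n in nodes):
            continue
        rel = assignment_reliability(assign, node_rel, link_rel, comm_pairs)
        if rel > best_rel:
            best, best_rel = assign, rel
    return best, best_rel


# Toy instance (all values invented for illustration).
node_rel = [0.95, 0.90, 0.85]                      # processor reliabilities
link_rel = {(0, 1): 0.99, (0, 2): 0.90, (1, 2): 0.95}
comm_pairs = [(0, 1), (1, 2), (2, 3)]              # tasks that communicate
best, best_rel = best_assignment(4, node_rel, link_rel, comm_pairs, capacity=2)
print(best, round(best_rel, 5))
```

On this toy instance the search places two tasks on each of the two most reliable processors, so only one link is used. With 3 processors and 4 tasks there are just 81 candidates, but the count grows exponentially with the number of tasks, which is why an approximate method is attractive.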
