A heuristic procedure for allocating tasks in fault-tolerant distributed computer systems

One way of achieving the increased levels of system reliability and availability demanded by critical computer-based control systems is through the use of fault-tolerant distributed computer systems. This article addresses the problem of allocating a set of m tasks among a set of n processors in a manner that satisfies various task assignment, system capacity, and task scheduling constraints while balancing the workload across processors. We discuss the problem background, the problem formulation, and a known heuristic procedure for the problem. A new solution-improving heuristic procedure is introduced, and computational experience with both heuristics is presented. With only a modest increase in computational effort, the new procedure is shown to improve solution quality dramatically and to obtain near-optimal solutions to the test problems.
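To make the problem statement concrete, the following is a minimal sketch of a greedy allocation under simplifying assumptions. It is not the known heuristic or the new solution-improving procedure from the article, neither of which is specified in this abstract. The function names `allocate` and `processor_loads`, the inputs `loads`, `replicas`, and `capacity`, and the rule that copies of a task must sit on distinct processors are all assumptions made for illustration. Task scheduling constraints are not modeled.

```python
from typing import List, Optional


def allocate(loads: List[float],
             replicas: List[int],
             capacity: List[float]) -> Optional[List[List[int]]]:
    """Greedy sketch: put each task copy on the least-loaded feasible processor.

    loads[i]    -- assumed processing load of task i
    replicas[i] -- assumed number of copies of task i (for fault tolerance);
                   copies of one task must go to distinct processors
    capacity[j] -- assumed load capacity of processor j

    Returns assignment[i] = processors hosting task i, or None when the
    greedy pass cannot place a copy (which does not prove infeasibility).
    """
    n = len(capacity)
    used = [0.0] * n
    assignment: List[List[int]] = [[] for _ in loads]

    # Place the heaviest tasks first so large items are balanced early.
    for i in sorted(range(len(loads)), key=lambda t: -loads[t]):
        for _ in range(replicas[i]):
            feasible = [
                j for j in range(n)
                if j not in assignment[i]              # separate the copies
                and used[j] + loads[i] <= capacity[j]  # respect capacity
            ]
            if not feasible:
                return None
            j = min(feasible, key=lambda p: used[p])   # balance the workload
            assignment[i].append(j)
            used[j] += loads[i]
    return assignment


def processor_loads(loads: List[float],
                    assignment: List[List[int]],
                    n: int) -> List[float]:
    """Total load placed on each of the n processors."""
    used = [0.0] * n
    for i, procs in enumerate(assignment):
        for j in procs:
            used[j] += loads[i]
    return used
```

A solution-improving step in the spirit of the abstract would start from such an assignment and try moves or swaps that lower the maximum processor load without violating the constraints; the specific moves used in the article are not given here.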
