A semi distributed task allocation strategy for large hypercube supercomputers

The authors present a semi-distributed approach to task scheduling in large parallel and distributed systems that differs from the conventional centralized and fully distributed approaches. The proposed strategy partitions the system into independent regions (spheres) centered at control points. These control points, called schedulers, optimally schedule tasks within their spheres and maintain state information with low overhead. The authors consider hypercube systems for evaluation and, using their algebraic characteristics, show that identifying the spheres and their scheduling points is an NP-complete problem. The performance of the proposed strategy is evaluated and compared with that of an efficient fully distributed strategy. In addition to yielding high performance in terms of response time, resource utilization, and throughput, the proposed strategy is shown to incur small overhead in terms of network traffic.
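
To make the notion of spheres concrete, the following is a minimal sketch, not the authors' algorithm. It assumes hypercube nodes are labeled with integers whose binary digits are the node coordinates, that distance between nodes is Hamming distance, and that the scheduler positions are already given (choosing them is the hard part the abstract describes as NP-complete). The function names `hamming`, `assign_spheres`, and `covering_radius` are hypothetical and chosen for illustration only.

```python
def hamming(a: int, b: int) -> int:
    """Number of hypercube links on a shortest path between nodes a and b."""
    return bin(a ^ b).count("1")


def assign_spheres(dim: int, schedulers: list[int]) -> dict[int, list[int]]:
    """Assign every node of a dim-dimensional hypercube to its nearest scheduler.

    Assumption: ties are broken by the smaller scheduler label. The source
    does not state a tie-breaking rule.
    """
    spheres: dict[int, list[int]] = {s: [] for s in schedulers}
    for node in range(2 ** dim):
        nearest = min(schedulers, key=lambda s: (hamming(node, s), s))
        spheres[nearest].append(node)
    return spheres


def covering_radius(dim: int, schedulers: list[int]) -> int:
    """Largest distance from any node to its nearest scheduler."""
    return max(
        min(hamming(node, s) for s in schedulers)
        for node in range(2 ** dim)
    )


if __name__ == "__main__":
    # 3-cube with schedulers at the antipodal nodes 000 and 111.
    spheres = assign_spheres(3, [0b000, 0b111])
    for center, members in spheres.items():
        print(format(center, "03b"), [format(m, "03b") for m in members])
    print("covering radius:", covering_radius(3, [0b000, 0b111]))
```

In this small case the two schedulers split the eight nodes into two disjoint spheres of radius 1, so each scheduler only needs state information from its immediate neighbors. For large hypercubes, finding scheduler positions that give such a partition is the problem the abstract identifies as NP-complete; the sketch only evaluates a given placement.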
