Study and comparison of bisectional and hypercube networks for dynamic task reallocation

A class of interconnection network architectures, known as bisectional networks, is studied for the purpose of dynamic task allocation in large-scale distributed systems consisting of hundreds or thousands of nodes. A bisectional network is constructed using the theory of symmetric balanced incomplete block designs. It is also isomorphic to a folded hypercube, in that a binary hypercube network can easily be extended into a bisectional network by adding links. These additional links give the network rich topological properties such as node symmetry, small diameter, small inter-node distance, and partitionability. The partitioning property is exploited to investigate a redundant task allocation strategy and a semi-distributed task redistribution strategy under real-time constraints. The same approach is used to partition hypercube systems as well. We study the performance of both networks and show that the bisectional network performs considerably better than the hypercube. The performance evaluation and comparison are carried out through simulation, taking into account system load, task migration overhead, task timing constraints, and node failure and repair rates, and examining important measures such as the probability of a task missing its deadline, task response time, and the probability of a task being lost.
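The effect of the additional links on diameter and inter-node distance can be illustrated with a minimal sketch. It does not reproduce the block-design construction of the bisectional network; instead it relies on the stated isomorphism and assumes the standard folded hypercube definition, in which each node of a binary n-cube gains one extra link to its bitwise complement. All function names below are hypothetical and chosen for illustration only.

```python
from collections import deque


def hypercube_neighbors(node, n):
    """Neighbors of `node` in a binary n-cube: flip one address bit."""
    return [node ^ (1 << i) for i in range(n)]


def folded_hypercube_neighbors(node, n):
    """Assumed folded hypercube: n-cube links plus one complement link."""
    mask = (1 << n) - 1
    return hypercube_neighbors(node, n) + [node ^ mask]


def distances_from(source, n, neighbors):
    """Breadth-first search returning hop counts from `source` to all nodes."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in neighbors(u, n):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def diameter(n, neighbors):
    """Both topologies are node-symmetric, so one BFS from node 0 suffices."""
    return max(distances_from(0, n, neighbors).values())


def mean_distance(n, neighbors):
    """Average hop count from node 0 to every other node."""
    dist = distances_from(0, n, neighbors)
    return sum(dist.values()) / (len(dist) - 1)


if __name__ == "__main__":
    for n in (4, 5, 6):
        print(
            n,
            diameter(n, hypercube_neighbors),
            diameter(n, folded_hypercube_neighbors),
            round(mean_distance(n, hypercube_neighbors), 3),
            round(mean_distance(n, folded_hypercube_neighbors), 3),
        )
```

Under this assumption, the complement link roughly halves the diameter (from n to about n/2, rounded up) at the cost of one extra link per node, which is consistent with the small-diameter and small inter-node-distance properties named above.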
