Scalable load-sharing for distributed systems

Adaptive algorithms for load-sharing usually comprise two basic functions: state information dissemination and decision making. The authors describe a flexible load-sharing algorithm, FLS, which includes a third function introduced for scalability purposes, that of partitioning into domains. The system partitioning function at a node is responsible for the selection of other nodes to be included in its domain. The state of other nodes in its domain is held locally, in a cache. Cached data are treated as hints for decision making. The FLS algorithm permits local decisions to be made, aims at minimizing the number of incorrect decisions, and does not allow erroneous decisions to proceed. The algorithm is analyzed and shown to be stable and scalable. Its suitability to a CONIC/RES environment was demonstrated with a prototype implementation, providing an automatic software allocation service as part of configuration management.
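The three functions named in the abstract (partitioning into domains, state dissemination into a local cache, and decision making on hints) can be illustrated with a small sketch. This is not the FLS algorithm as published: the `Node` class, the `THRESHOLD` value, the method names, and the ordering of candidates by hinted load are all assumptions made for illustration. The only ideas taken from the abstract are that each node tracks just the nodes in its own domain, that cached state is treated as a hint, and that a decision based on a wrong hint is not allowed to proceed.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

THRESHOLD = 2  # hypothetical load threshold; the abstract gives no value


@dataclass
class Node:
    name: str
    load: int = 0
    # Partitioning: the peers this node has selected for its domain.
    domain: List["Node"] = field(default_factory=list)
    # Cached (possibly stale) load of each domain peer, used only as a hint.
    cache: Dict[str, int] = field(default_factory=dict)

    def refresh_cache(self) -> None:
        """State dissemination: record the current load of each domain peer."""
        for peer in self.domain:
            self.cache[peer.name] = peer.load

    def accept(self) -> bool:
        """Receiver-side check against true state, so a stale hint
        cannot push a job onto an already busy node."""
        if self.load < THRESHOLD:
            self.load += 1
            return True
        return False

    def place_job(self) -> Optional["Node"]:
        """Decision making: run locally if possible, otherwise try
        domain peers in order of hinted load. Returns the chosen node,
        or None if no node in the domain accepts the job."""
        if self.load < THRESHOLD:
            self.load += 1
            return self
        by_hint = sorted(self.domain,
                         key=lambda p: self.cache.get(p.name, THRESHOLD))
        for peer in by_hint:
            if self.cache.get(peer.name, THRESHOLD) >= THRESHOLD:
                break  # hints say this and all remaining peers are busy
            accepted = peer.accept()
            # Reading peer.load here stands in for the reply message
            # that would carry the peer's real state back to the sender.
            self.cache[peer.name] = peer.load
            if accepted:
                return peer
        return None
```

In this sketch a node consults only its own cache, so the cost of a decision depends on the domain size rather than on the total number of nodes, which is the scalability argument the abstract points to. A stale hint costs one rejected request and is corrected on the spot rather than resulting in a bad placement.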
