Scalable resource scheduling: design, assessment, prototyping

Resource scheduling in distributed systems aims to maximize system performance by using the available system resources efficiently. Large distributed systems, comprising hundreds or thousands of nodes and spanning vast geographical distances (e.g., the Internet), require resource scheduling to be scalable. Scalability has become a common requirement in the design and development of distributed software. This paper describes a comprehensive approach to software development, leading from requirements specification, through design and algorithm assessment, to a prototype implementation of a scalable resource scheduling policy. Scalability is achieved by system partitioning. Communication delays may limit scalability and degrade system performance; in this work, they are handled explicitly to improve the performance of the scheduling policy. The paper presents performance results obtained in simulation under communication and computation overload conditions. The simulation code is later reused for the prototype implementation. Finally, we examine the software design issues and the applicability of the prototype to different distributed environments, using PVM as an example.
