Effective queueing strategies for co-scheduling in a pool of processors

We consider a connected set of workstations as a 'pool of processors' and develop a queueing model to analyse the performance of optimal co-scheduling algorithms. The pool of processors model was originally developed for the Amoeba operating system, and was also used in the design of the recent IBM 9076 SP1 supercomputer. Co-scheduling has recently been suggested as an approach for scheduling computationally intensive tasks in the pool of processors model. A co-scheduling algorithm selects the subset of workstations that minimizes a task's completion time. The queueing model developed here allows us to investigate the dynamic performance of co-scheduling algorithms from the system point of view under several queueing strategies. We combine co-scheduling with six different queueing strategies and compare the results to the M/M/m system, in which each arriving task is assigned to a single workstation as a whole computation and no co-scheduling takes place. The results show that the co-scheduling approach is viable under a wide range of system parameters. Moreover, the performance differences between queueing strategies tend to diminish as the number of workstations grows. This suggests that, when the number of workstations is large, co-scheduling is applicable under any of the queueing disciplines considered here.
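
For orientation, the M/M/m baseline mentioned above has a standard closed-form mean response time given by the Erlang C formula. The following is a minimal sketch of that textbook result only, not of the co-scheduling model itself; the function names and the parameters `m` (number of workstations), `lam` (task arrival rate) and `mu` (per-workstation service rate) are illustrative assumptions and do not come from the paper.

```python
from math import factorial


def erlang_c(m, lam, mu):
    """Probability that an arriving task has to wait in an M/M/m queue.

    m, lam and mu are hypothetical parameter names: number of servers,
    arrival rate and per-server service rate.
    """
    a = lam / mu      # offered load
    rho = a / m       # utilization per server
    if rho >= 1:
        raise ValueError("unstable system: need lam < m * mu")
    top = a ** m / factorial(m) / (1 - rho)
    bottom = sum(a ** k / factorial(k) for k in range(m)) + top
    return top / bottom


def mean_response_time(m, lam, mu):
    """Mean time in system (waiting plus service) for an M/M/m queue."""
    waiting = erlang_c(m, lam, mu) / (m * mu - lam)
    return waiting + 1.0 / mu


if __name__ == "__main__":
    # Same utilization (0.5) with a growing number of workstations.
    for m in (1, 2, 8):
        print(m, mean_response_time(m, lam=0.5 * m, mu=1.0))
```

At a fixed utilization the waiting component shrinks as `m` grows, so the mean response time of this baseline approaches the bare service time `1 / mu`.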
