Scheduling resources in multi-user, heterogeneous, computing environments with SmartNet

It is increasingly common for computer users to have access to several computers on a network, and hence to be able to execute many of their tasks on any of several computers. The choice of which computer executes which task is commonly made by users, based on knowledge of each computer's speed on each task and the current load on each computer. A number of task scheduling systems have been developed that balance the load of the computers on the network, but such systems tend to minimize the idle time of the computers rather than the idle time of the users. This paper focuses on the benefits that can be achieved when the scheduling system considers both computer availability and the performance of each task on each computer. The SmartNet resource scheduling system is described and compared with two other resource allocation strategies: load balancing and user-directed assignment. Results are presented from batch-environment simulations of hundreds of different networks of computers running thousands of different mixes of tasks. These results indicate that, for the computer environments simulated, SmartNet outperforms both load balancing and user-directed assignment, as measured by the maximum time users must wait for their tasks to finish.
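The distinction between the three strategies can be illustrated with a small sketch. It is not SmartNet's actual algorithm, which the abstract does not specify. As an assumption, a greedy minimum-completion-time rule stands in for a scheduler that considers both machine availability and per-task performance. The function names, the `ETC` matrix of expected run times, and its values are all hypothetical and chosen only for illustration.

```python
# Illustrative sketch only: three simple batch assignment rules compared by
# makespan (the longest time any user waits). None of this is taken from
# SmartNet itself; the heuristics and the data are assumptions.

def makespan(etc, assignment):
    """Finish time of the busiest machine for a given task-to-machine map."""
    ready = [0.0] * len(etc[0])
    for task, machine in enumerate(assignment):
        ready[machine] += etc[task][machine]
    return max(ready)

def load_balancing(etc):
    """Send each task to the machine that frees up first, ignoring how
    fast that machine actually runs the task."""
    ready = [0.0] * len(etc[0])
    assignment = []
    for times in etc:
        m = min(range(len(ready)), key=lambda j: ready[j])
        ready[m] += times[m]
        assignment.append(m)
    return assignment

def user_directed(etc):
    """Send each task to the machine where it runs fastest, ignoring the
    load already queued on that machine."""
    return [min(range(len(times)), key=lambda j: times[j]) for times in etc]

def min_completion_time(etc):
    """Send each task to the machine where it would finish earliest,
    combining current availability with per-task run time."""
    ready = [0.0] * len(etc[0])
    assignment = []
    for times in etc:
        m = min(range(len(ready)), key=lambda j: ready[j] + times[j])
        ready[m] += times[m]
        assignment.append(m)
    return assignment

# Hypothetical expected run times: ETC[task][machine], two machines.
ETC = [[10, 2], [2, 10], [2, 3], [2, 3], [2, 3]]

results = {
    "load balancing": makespan(ETC, load_balancing(ETC)),
    "user directed": makespan(ETC, user_directed(ETC)),
    "availability + performance": makespan(ETC, min_completion_time(ETC)),
}
for name, value in results.items():
    print(f"{name}: {value}")
```

On this toy input the rule that uses both kinds of information yields the shortest maximum wait, but a single hand-picked example says nothing about how the strategies compare in general; the paper's own conclusions rest on its simulations.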
