Theory and Practice in Parallel Job Scheduling

The scheduling of jobs on parallel supercomputers is becoming the subject of much research. However, there is concern about the divergence of theory and practice. We review theoretical research in this area and the recommendations that follow from recent results. This is contrasted with a proposal for standard interfaces among the components of a scheduling system, which has grown from requirements in the field.
