Towards Convergence in Job Schedulers for Parallel Supercomputers

The space of job schedulers for parallel supercomputers is rather fragmented, because different researchers tend to make different assumptions about the goals of the scheduler, the information that is available about the workload, and the operations that the scheduler may perform. We argue that by identifying these assumptions explicitly, it is possible to reach a level of convergence. For example, it is possible to unite most of the different assumptions into a common framework by associating a suitable cost function with the execution of each job. The cost function reflects knowledge about the job and the degree to which it fits the goals of the system. Given such cost functions, scheduling is done to maximize the system's profit.
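The following is a minimal sketch, not taken from the paper, of how such a cost-function framework might look in code: each job carries a site-supplied value function of its waiting time, and a greedy scheduler picks the runnable job with the highest value per processor-second. All names here (Job, pick_next, value, est_runtime) are illustrative assumptions, and a real scheduler would of course combine this with allocation and preemption policies.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

@dataclass
class Job:
    name: str
    procs: int                        # processors requested
    est_runtime: float                # estimated runtime in seconds
    value: Callable[[float], float]   # site-supplied value (cost) function of waiting time

def pick_next(queue: List[Job], free_procs: int,
              waited: Dict[str, float]) -> Optional[Job]:
    """Greedily pick the runnable job with the highest value per processor-second."""
    runnable = [j for j in queue if j.procs <= free_procs]
    if not runnable:
        return None
    return max(
        runnable,
        key=lambda j: j.value(waited[j.name]) / (j.procs * j.est_runtime),
    )

if __name__ == "__main__":
    # Example: an interactive job whose value decays with waiting time,
    # and a batch job whose value is flat regardless of wait.
    jobs = [
        Job("interactive", procs=4, est_runtime=600.0,
            value=lambda w: max(0.0, 100.0 - 0.1 * w)),
        Job("batch", procs=32, est_runtime=7200.0,
            value=lambda w: 50.0),
    ]
    waited = {"interactive": 120.0, "batch": 3600.0}
    chosen = pick_next(jobs, free_procs=16, waited=waited)
    print("next job:", chosen.name if chosen else None)
```

In this toy setting the batch job cannot run on 16 free processors, so the interactive job is chosen; different value functions would encode different system goals (throughput, responsiveness, fairness) within the same scheduling loop.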
