Kernel-level scheduling for the nano-threads programming model

Multiprocessor systems are increasingly the systems of choice for both low-end and high-end servers, running tasks as diverse as number crunching, large-scale simulations, database engines, and World Wide Web server applications. With such diverse workloads, system utilization and throughput, as well as execution time, become important performance metrics. In this paper we present efficient kernel scheduling policies and propose a new kernel-user interface that aims to support efficient parallel execution in diverse workload environments. Our approach relies on support for user-level threads, which are used to exploit parallelism within applications, and on a two-level scheduling policy that coordinates the number of resources allocated by the kernel with the number of threads generated by each application. We compare our scheduling policies with the native gang scheduling policy of the IRIX 6.4 operating system on a Silicon Graphics Origin2000. Our experimental results show substantial performance gains in terms of overall workload execution times, individual application execution times, and cache performance.
