Adaptive Scheduling under Memory Pressure on Multiprogrammed Clusters

This paper presents scheduling techniques that enable parallel programs to adapt to clustered computational farms with limited memory capacity. The techniques coschedule communicating processes and prevent paging, using two cooperating extensions to the kernel scheduler. A paging prevention module enables memory-bound programs to adapt to memory shortage by suspending their threads at well-defined execution points. The associated operating system interface provides a generic mechanism that enables programs to adapt in different ways, including application-specific forms of adaptation. At the same time, a dynamic coscheduling heuristic implemented in the kernel scheduler periodically increases the priority of communicating processes so that parallel jobs are eased through communication points. We show that when a parallel job competes with randomized sequential load running on the nodes of the cluster, the combination of coscheduling and paging prevention drastically reduces the slowdown of the job at high levels of memory utilization. We also show that if memory resources are ample, coscheduling should take priority over paging prevention, whereas if memory resources are scarce, preventing paging should take priority over coscheduling.
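The priority rule stated at the end of the abstract can be illustrated with a minimal sketch. This is not the kernel implementation described in the paper. The names `NodeState`, `memory_is_scarce`, and `choose_action`, and the 0.9 utilization threshold, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class NodeState:
    """Memory state of one cluster node (hypothetical model)."""
    total_mem_mb: int
    used_mem_mb: int


def memory_is_scarce(node: NodeState, job_resident_mb: int,
                     threshold: float = 0.9) -> bool:
    """Return True if running the job would push memory utilization
    above the threshold. The 0.9 default is an assumed value."""
    projected = (node.used_mem_mb + job_resident_mb) / node.total_mem_mb
    return projected > threshold


def choose_action(node: NodeState, job_resident_mb: int,
                  is_communicating: bool) -> str:
    """Pick a scheduling action for a thread at an adaptation point."""
    # Scarce memory: paging prevention takes priority over coscheduling,
    # so the thread is suspended even if it is communicating.
    if memory_is_scarce(node, job_resident_mb):
        return "suspend"
    # Ample memory: coscheduling takes priority, so a communicating
    # process gets a priority boost.
    if is_communicating:
        return "boost_priority"
    return "run"
```

In this sketch, a communicating thread on a lightly loaded node is boosted, while the same thread on a node close to its memory limit is suspended instead.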
