Paper information - Adaptive scheduling under memory constraints on non-dedicated computational farms

Adaptive scheduling under memory constraints on non-dedicated computational farms

This paper presents scheduler extensions that enable parallel programs to adapt better to the execution conditions of non-dedicated computational farms with limited memory resources. The techniques aim to prevent thrashing and to co-schedule communicating threads, using two disjoint yet cooperating extensions to the kernel scheduler. A thrashing prevention module enables memory-bound programs to adapt to memory shortage by suspending their threads at selected points of execution. Thread suspension is used so that parallel jobs, which are assumed to run as guests on the nodes of the computational farm, do not over-commit memory at memory allocation points. In the event of thrashing, parallel jobs are the first to release memory, which helps local resident jobs make progress. Adaptation is implemented using a shared-memory interface in the /proc filesystem and upcalls from the kernel to user space. Orthogonally, co-scheduling is implemented in the kernel with a heuristic that periodically boosts the priority of communicating threads. Using experiments on a cluster of workstations, we show that when a guest parallel job competes with general-purpose interactive, I/O-intensive, or CPU- and memory-intensive load on the nodes of the cluster, thrashing prevention drastically reduces the slowdown of the job at memory utilization levels of 20% or higher. The slowdown of parallel jobs is reduced by up to a factor of 7. Co-scheduling provides a limited performance improvement at memory utilization levels below 20%, but has no significant effect at higher memory utilization levels.
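The thrashing prevention policy described in the abstract can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not the paper's kernel implementation: the `Node` class, the `guest_allocate` and `on_thrashing` functions, and the megabyte accounting are all hypothetical names introduced here. It shows two ideas from the abstract: a guest thread suspends at a memory allocation point if the request would over-commit memory, and guest jobs release memory first when thrashing is detected.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical model of one farm node; all sizes are in MB."""
    total_mb: int               # physical memory on the node
    used_by_local_mb: int       # memory held by local resident jobs
    used_by_guest_mb: int = 0   # memory held by guest parallel jobs

    def free_mb(self) -> int:
        return self.total_mb - self.used_by_local_mb - self.used_by_guest_mb


def guest_allocate(node: Node, request_mb: int) -> str:
    """Decide what a guest thread does at a memory allocation point.

    The request is granted only if it fits in free memory; otherwise the
    thread is suspended so that memory is not over-committed.
    """
    if request_mb <= node.free_mb():
        node.used_by_guest_mb += request_mb
        return "run"
    return "suspend"


def on_thrashing(node: Node, needed_mb: int) -> int:
    """Guest jobs release memory first; returns the number of MB released."""
    released = min(node.used_by_guest_mb, needed_mb)
    node.used_by_guest_mb -= released
    return released


if __name__ == "__main__":
    node = Node(total_mb=1024, used_by_local_mb=600)
    print(guest_allocate(node, 300))  # fits in the 424 MB that are free
    print(guest_allocate(node, 200))  # would over-commit, so the thread waits
    print(on_thrashing(node, 250))    # guest gives back memory to local jobs
```

In the paper the equivalent decisions are made through a shared-memory interface in /proc and kernel upcalls; the model above replaces those with plain function calls to keep the policy visible.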

Dimitrios S. Nikolopoulos | Constantine D. Polychronopoulos

[1] Dimitrios S. Nikolopoulos, et al. Adaptive scheduling under memory pressure on multiprogrammed SMPs, 2002, Proceedings 16th International Parallel and Distributed Processing Symposium.

[2] Dror G. Feitelson, et al. Gang scheduling with memory considerations, 2000, Proceedings 14th International Parallel and Distributed Processing Symposium. IPDPS 2000.

[3] Dimitrios S. Nikolopoulos, et al. Adaptive Scheduling under Memory Pressure on Multiprogrammed Clusters, 2002, 2nd IEEE/ACM International Symposium on Cluster Computing and the Grid (CCGRID'02).

[4] Jeffrey K. Hollingsworth, et al. Mechanisms and policies for supporting fine-grained cycle stealing, 1999, ICS '99.

[5] Sanjeev Setia, et al. The Interaction between Memory Allocation and Adaptive Partitioning in Message-Passing Multicomputers, 1995, JSSPP.

[6] John K. Ousterhout, et al. Scheduling Techniques for Concurrent Systems, 1982, ICDCS.

[7] Scott Pakin, et al. Dynamic Coscheduling on Workstation Clusters, 1998, JSSPP.

[8] Emilio Luque, et al. Coscheduling under Memory Constraints in a NOW Environment, 2001, JSSPP.

[9] Sanjeev Setia, et al. Availability and utility of idle memory in workstation clusters, 1999, SIGMETRICS '99.

[10] Song Jiang, et al. Adaptive Page Replacement to Protect Thrashing in Linux, 2001, Annual Linux Showcase & Conference.

[11] Joel H. Saltz, et al. The utility of exploiting idle workstations for parallel computation, 1997, SIGMETRICS '97.

[12] Ian T. Foster, et al. Condor-G: A Computation Management Agent for Multi-Institutional Grids, 2004, Cluster Computing.

[13] Evgenia Smirni, et al. Algorithmic modifications to the Jacobi-Davidson parallel eigensolver to dynamically balance external CPU and memory load, 2001, ICS '01.

[14] Andrea C. Arpaci-Dusseau, et al. Implicit coscheduling: coordinated scheduling with implicit information in distributed systems, 2001, TOCS.

[15] John Zahorjan, et al. Scheduling memory constrained jobs on distributed memory parallel computers, 1995, SIGMETRICS '95/PERFORMANCE '95.

[16] John K. Ousterhout. Scheduling Techniques for Concurrent Systems, 1982, ICDCS 1982.

[17] Li Xiao, et al. Dynamic load sharing with unknown memory demands in clusters, 2001, Proceedings 21st International Conference on Distributed Computing Systems.

[18] Kenneth C. Sevcik, et al. Coordinated allocation of memory and processors in multiprocessors, 1996, SIGMETRICS '96.

[19] David A. Wood, et al. Paging tradeoffs in distributed-shared-memory multiprocessors, 1994, Supercomputing '94.