Realistic Workload Scheduling Policies for Taming the Memory Bandwidth Bottleneck of SMPs

In this paper we reformulate the thread scheduling problem on multiprogrammed SMPs. Scheduling algorithms usually attempt to maximize the performance of memory-intensive applications by optimally exploiting the cache hierarchy. We present experimental results indicating that, contrary to common belief, the performance loss of memory-intensive, multiprogrammed workloads is disproportionate to the deterioration of cache performance caused by interference between threads. In previous work [1] we found that memory bandwidth saturation is often the actual bottleneck that determines the performance of multiprogrammed workloads. Therefore, we present and evaluate two realistic scheduling policies that treat memory bandwidth as a first-class resource. Their design methodology is general enough to be applied to introduce bus bandwidth awareness into conventional scheduling policies. Experimental results substantiate the advantages of our approach.

[1]  Eleftherios D. Polychronopoulos, et al.  A Tool to Schedule Parallel Applications on Multiprocessors: The NANOS CPU MANAGER, 2000, JSSPP.

[2]  Eleftherios D. Polychronopoulos, et al.  Kernel-level scheduling for the nano-threads programming model, 1998, ICS '98.

[3]  Xavier Martorell, et al.  NanosCompiler: A Research Platform for OpenMP Extensions, 1999.

[4]  Mark S. Squillante, et al.  Using Processor-Cache Affinity Information in Shared-Memory Multiprocessor Scheduling, 1993, IEEE Trans. Parallel Distributed Syst.

[5]  Jesús Labarta, et al.  Performance-driven processor allocation, 2000, IEEE Transactions on Parallel and Distributed Systems.

[6]  Dimitrios S. Nikolopoulos, et al.  Scheduling algorithms with bus bandwidth considerations for SMPs, 2003, 2003 International Conference on Parallel Processing, 2003. Proceedings.

[7]  Raj Vaswani, et al.  The implications of cache affinity on processor scheduling for multiprogrammed, shared memory multiprocessors, 1991, SOSP '91.

[8]  Dean M. Tullsen, et al.  Symbiotic jobscheduling with priorities for a simultaneous multithreading processor, 2002, SIGMETRICS '02.

[9]  Josep Torrellas, et al.  Evaluating the Performance of Cache-Affinity Scheduling in Shared-Memory Multiprocessors, 1995, J. Parallel Distributed Comput.

[10]  Dean M. Tullsen, et al.  Symbiotic jobscheduling for a simultaneous multithreading processor, 2000, SIGP.

[11]  Thu D. Nguyen, et al.  Maximizing speedup through self-tuning of processor allocation, 1996, Proceedings of International Conference on Parallel Processing.