Performance-centric scheduling with task migration for a heterogeneous compute node in the data center

The use of heterogeneous computing resources, such as Graphics Processing Units or other specialized coprocessors, has become widespread in recent years because of their performance and energy-efficiency advantages. Approaches for managing and scheduling tasks on heterogeneous resources are still subject to research. Although queuing systems have recently been extended to support accelerator resources, a general solution that manages heterogeneous resources at the operating-system level and exploits a global view of the system state is still missing. In this paper we present a user-space scheduler that enables task scheduling and migration on heterogeneous processing resources in Linux. Using one run queue per available resource, we make scheduling decisions based on the system state and on task characterizations obtained from earlier measurements. With a programming pattern that supports the integration of checkpoints into applications, we preempt tasks and migrate them between three very different compute resources. Considering static and dynamic workload scenarios, we show that this approach improves performance by up to 17% (7% on average) by effectively avoiding idle resources. We also demonstrate that a work-conserving strategy without migration is not a suitable alternative.
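The interplay of run queues, measured task characteristics, checkpoints, and migration can be illustrated with a minimal sketch. It is not the scheduler presented in the paper: the resource names, the per-step cost table, and the helpers `enqueue` and `migrate_to_idle` are hypothetical and chosen only for illustration. It assumes each task exposes checkpoints at which its progress is saved, and that a measured cost per step is known for every resource the task can run on.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    total_steps: int
    step_cost: dict      # resource -> measured seconds per step (assumed data)
    done_steps: int = 0  # progress saved at the last checkpoint

    def remaining_cost(self, resource):
        """Estimated time to finish on `resource`, from the saved progress."""
        return (self.total_steps - self.done_steps) * self.step_cost[resource]

    def run_to_checkpoint(self, steps):
        """Advance by at most `steps` steps and stop at a checkpoint."""
        self.done_steps = min(self.total_steps, self.done_steps + steps)


def queue_load(queue, resource):
    """Estimated time needed to drain a run queue on its resource."""
    return sum(task.remaining_cost(resource) for task in queue)


def enqueue(task, queues):
    """Put the task on the run queue with the earliest estimated finish."""
    candidates = [r for r in queues if r in task.step_cost]
    best = min(
        candidates,
        key=lambda r: queue_load(queues[r], r) + task.remaining_cost(r),
    )
    queues[best].append(task)
    return best


def migrate_to_idle(queues):
    """Give each idle resource one task taken from the most loaded queue."""
    moves = []
    idle_resources = [r for r, q in queues.items() if not q]
    for idle in idle_resources:
        donor = max(queues, key=lambda r: queue_load(queues[r], r))
        if len(queues[donor]) < 2:
            continue  # never empty the donor just to fill another queue
        # Prefer the task at the tail, which would otherwise wait longest.
        for task in list(queues[donor])[::-1]:
            if idle in task.step_cost:
                queues[donor].remove(task)
                queues[idle].append(task)
                moves.append((task.name, donor, idle))
                break
    return moves
```

In this sketch a task resumes on its new resource from `done_steps`, so work completed before the last checkpoint is not repeated. A real implementation would also have to account for the cost of transferring checkpoint data between resources, which is ignored here.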
