On the Dynamic Scheduling of Task Farm Patterns on a Heterogeneous CPU-GPGPU Environment

Heterogeneous clusters and multi-core environments are gradually overtaking homogeneous systems because of their high performance and flexibility. Task scheduling in these systems is an extensively studied subject. However, in a heterogeneous architecture consisting of multi-core CPUs and many-core GPGPUs (General-Purpose Graphics Processing Units), task mapping becomes much more complex because the processors differ in architecture and programming model. Consequently, designing a scheduler that facilitates a balanced distribution of load while taking full advantage of the processing power of a CPU-GPGPU system is nontrivial. In this paper we discuss a multi-round scheduling algorithm and a scheduling framework for farm-pattern-based applications on such a system. This is an important step towards a full-scale pattern-based scheduler that automatically and efficiently maps parallel tasks to heterogeneous processors. As a proof of concept, we have designed a scheduling framework for task-farm-pattern-based applications. The framework provides the necessary "separation of concerns" and hides the underlying scheduling details from the programmer. The experimental results demonstrate that our dynamic scheduling algorithm achieves performance comparable to or better than that of some well-known scheduling algorithms for CPU-GPGPU systems.
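
To make the idea of multi-round scheduling for a task farm concrete, the following is a minimal simulated sketch, not the algorithm evaluated in the paper. It assumes independent tasks of equal cost, a fixed number of rounds, and a simple policy in which each round's batch is split in proportion to the throughput observed in the previous round. The names `Worker` and `multi_round_farm`, the `rate` field, and the equal-share batch sizing are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Worker:
    name: str
    rate: float            # true tasks/second; used only to simulate execution
    estimate: float = 1.0  # scheduler's current throughput estimate


def multi_round_farm(num_tasks, workers, rounds=4):
    """Distribute independent tasks over workers in several rounds.

    Returns (history, makespan): history holds the per-worker task counts
    of each round, makespan is the simulated total time in seconds.
    """
    remaining = num_tasks
    history = []
    makespan = 0.0
    for r in range(rounds):
        if remaining == 0:
            break
        # The last round takes everything left; earlier rounds take an
        # equal share of the remaining work (an assumed policy).
        if r == rounds - 1:
            batch = remaining
        else:
            batch = max(1, remaining // (rounds - r))
        # Split the batch in proportion to the current estimates.
        total = sum(w.estimate for w in workers)
        shares = [int(batch * w.estimate / total) for w in workers]
        # Give any rounding leftover to the worker with the best estimate.
        best = max(range(len(workers)), key=lambda i: workers[i].estimate)
        shares[best] += batch - sum(shares)
        # Simulate the round and refresh each estimate from what was observed.
        round_time = 0.0
        for w, n in zip(workers, shares):
            if n == 0:
                continue
            elapsed = n / w.rate
            w.estimate = n / elapsed
            round_time = max(round_time, elapsed)
        history.append({w.name: n for w, n in zip(workers, shares)})
        makespan += round_time
        remaining -= batch
    return history, makespan


if __name__ == "__main__":
    farm = [Worker("cpu", rate=10.0), Worker("gpu", rate=40.0)]
    for i, assignment in enumerate(multi_round_farm(1000, farm)[0]):
        print(f"round {i}: {assignment}")
```

In this sketch the first round acts as a probe: with no prior knowledge, both workers receive equal shares, and later rounds shift work towards the faster device. A real CPU-GPGPU scheduler would also have to account for data transfer costs and for overlapping rounds, which this simulation ignores.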
