Efficient Workload Balancing on Heterogeneous GPUs using Mixed-Integer Non-Linear Programming

Heterogeneous system architectures have recently become mainstream as a way to achieve both high performance and power efficiency. In particular, many-core graphics processing units (GPUs) now play an important role in heterogeneous computing. For application designers, however, the computational workload must still be distributed across heterogeneous GPUs manually, and the resulting distribution is often inefficient. In this paper, we propose a method based on mixed-integer non-linear programming (MINLP) that distributes workload efficiently across heterogeneous GPUs by accounting for the asymmetric capabilities of the GPUs for different applications. Experimental results show that, compared to previous methods, the proposed method improves performance by up to 33% and load balance by up to 116%. Moreover, the method incurs only a small overhead while achieving high performance and good load balance.
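
To make the idea of integer workload distribution under non-linear per-GPU costs concrete, the following is a minimal sketch and not the formulation used in this paper. It assumes a hypothetical quadratic time model for each GPU (`gpu_time` and its `overhead`, `per_unit`, and `contention` parameters are illustrative names) and replaces an MINLP solver with exhaustive enumeration, which is only practical for small instances. The objective shown, minimizing the finishing time of the slowest GPU, is likewise an assumption.

```python
from itertools import product


def gpu_time(units, overhead, per_unit, contention):
    """Hypothetical non-linear time model for one GPU (an assumption,
    not taken from the paper): fixed overhead plus linear and quadratic terms."""
    if units == 0:
        return 0.0  # an idle GPU contributes no time
    return overhead + per_unit * units + contention * units ** 2


def best_split(total_units, gpus):
    """Enumerate every integer split of total_units across the GPUs and
    return (makespan, split) for the split with the smallest makespan.

    gpus is a list of (overhead, per_unit, contention) tuples.
    Exhaustive search stands in for a real MINLP solver here.
    """
    best = None
    # Choose the share of all GPUs but the last; the last takes the remainder.
    for head in product(range(total_units + 1), repeat=len(gpus) - 1):
        last = total_units - sum(head)
        if last < 0:
            continue  # this combination assigns more work than exists
        split = head + (last,)
        makespan = max(gpu_time(n, *params) for n, params in zip(split, gpus))
        if best is None or makespan < best[0]:
            best = (makespan, split)
    return best


if __name__ == "__main__":
    # Two hypothetical GPUs: the second takes twice as long per work unit.
    print(best_split(30, [(0.0, 1.0, 0.0), (0.0, 2.0, 0.0)]))
```

With the two purely linear GPUs in the usage example, the search assigns 20 units to the faster GPU and 10 to the slower one, so both finish at time 20.0. With non-zero overhead or contention terms the balanced split is no longer proportional to raw speed, which is the kind of asymmetry an MINLP formulation is meant to capture.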
