Dynamic load balancing on multi-GPUs system for big data processing

The powerful parallel computing capability of modern GPUs (Graphics Processing Units) has attracted increasing attention from researchers and engineers, who have conducted a large number of GPU-based acceleration research projects. However, current single-GPU solutions still cannot meet the real-time computational requirements of the latest big data applications. Multi-GPU solutions have therefore become a trend for many real-time applications. In those cases, balancing the computational load across the GPU nodes is often the key bottleneck and needs further study to ensure the best possible performance. Existing load balancing approaches mainly assume that all GPUs in the same system provide equal computational performance, and so fall short on heterogeneous multi-GPU systems. This paper presents a novel dynamic load balancing model for heterogeneous multi-GPU systems based on the fuzzy neural network (FNN) framework. The model has been implemented and demonstrated in a case study on improving the computational performance of a two-dimensional (2D) discrete wavelet transform (DWT). Experimental results show that this dynamic load balancing model enables a high computational throughput that can satisfy the real-time and accuracy requirements of many big data processing applications.
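To make the heterogeneity problem concrete, the sketch below splits the rows of a 2D input across GPUs in proportion to each device's measured throughput and then refreshes those estimates after every pass. It is only an illustration of throughput-proportional balancing under stated assumptions: it is not the FNN-based model the paper proposes, and the function names, the rows-per-second metric, and the smoothing factor `alpha` are all hypothetical.

```python
def partition_rows(total_rows, throughputs):
    """Split total_rows among devices in proportion to their throughput.

    throughputs: positive rows/second estimates, one per GPU (assumed metric).
    """
    total = sum(throughputs)
    shares = [int(total_rows * t / total) for t in throughputs]
    # Truncation can leave a few rows unassigned; give them to the fastest devices.
    leftover = total_rows - sum(shares)
    order = sorted(range(len(throughputs)),
                   key=lambda i: throughputs[i], reverse=True)
    for i in order[:leftover]:
        shares[i] += 1
    return shares


def update_throughputs(throughputs, rows_done, seconds, alpha=0.5):
    """Blend each old estimate with the rows/second measured in the last pass."""
    return [
        (1 - alpha) * old + alpha * (rows / secs)
        for old, rows, secs in zip(throughputs, rows_done, seconds)
    ]


if __name__ == "__main__":
    # Hypothetical starting estimates: GPU 0 is three times faster than GPU 1.
    estimates = [3.0, 1.0]
    shares = partition_rows(1024, estimates)
    print(shares)
    # Pretend both devices took 4 seconds for their share, then re-estimate.
    estimates = update_throughputs(estimates, shares, [4.0, 4.0])
    print(partition_rows(1024, estimates))
```

An equal split would leave the faster GPU idle while the slower one finishes, which is the shortcoming of equal-performance assumptions that the abstract points to. A learned model such as the paper's FNN would take the place of the simple smoothing rule used here.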
