Paper information - Workload Partitioning Strategy for Improved Parallelism on FPGA-CPU Heterogeneous Chips

In heterogeneous computing, efficient parallelism can be obtained if every device runs the same task on a different portion of the data set. This requires a scheduler that assigns data chunks to compute units in proportion to their throughputs. For FPGA-CPU heterogeneous devices, a scheduler must accurately evaluate the differing performance behaviour of the compute devices in order to provide the best possible overall throughput. In this article, we propose a scheduler that first detects, with negligible overhead, the highest throughput each device can obtain for a specific application and then partitions the data set for improved performance. To demonstrate the efficiency of this method, we choose a Zynq UltraScale+ ZCU102 device as the hardware target and parallelise four applications, showing that the developed scheduler can provide up to 94.06% of the throughput achievable under ideal conditions, with comparable power and energy consumption.
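The idea of assigning data chunks in proportion to device throughput can be illustrated with a minimal sketch. This is not the scheduler proposed in the article, whose details the abstract does not give. It only assumes that a per-device throughput (items per second) has already been measured. The function name `partition_by_throughput`, the device labels, and the throughput figures are hypothetical.

```python
def partition_by_throughput(total_items, throughputs):
    """Split total_items across devices in proportion to their throughput.

    throughputs maps a device label to a measured rate (items per second).
    Returns a dict mapping each device label to an integer chunk size.
    """
    total_tp = sum(throughputs.values())
    if total_tp <= 0:
        raise ValueError("at least one device must have positive throughput")

    # Ideal (fractional) share of the data set for each device.
    exact = {dev: total_items * tp / total_tp for dev, tp in throughputs.items()}

    # Round every share down, then hand the leftover items to the devices
    # with the largest fractional parts so the chunks sum to total_items.
    chunks = {dev: int(share) for dev, share in exact.items()}
    leftover = total_items - sum(chunks.values())
    by_fraction = sorted(exact, key=lambda dev: exact[dev] - chunks[dev], reverse=True)
    for dev in by_fraction[:leftover]:
        chunks[dev] += 1
    return chunks


if __name__ == "__main__":
    # Hypothetical measured rates: the FPGA is three times faster than the CPU.
    print(partition_by_throughput(1000, {"fpga": 300.0, "cpu": 100.0}))
```

With these made-up rates the FPGA receives three quarters of the items and the CPU one quarter, so both devices would finish at roughly the same time if the measured throughputs held for the whole run.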
