A High-Performance and Power-Efficient SIMD Convolution Engine for FPGAs

This paper presents a power-efficient hardware architecture for performing 3D convolution in modern high-performance FPGA-based video processors. The proposed design processes input data in a novel platform-independent Single-Instruction-Multiple-Data (SIMD) mode, which allows multiple levels of parallelism to be supported. For comparison with existing hardware accelerators, several implementations are characterized on different FPGA devices. When implemented on low-end chips, such as the Xilinx XC7Z020, the proposed convolution engine reaches a maximum clock frequency 22% higher than that of competing designs, without significantly increasing dynamic energy consumption. When more advanced DSP slices are used, such as those available in Xilinx UltraScale FPGA devices, the proposed convolution engine dissipates up to 40% less dynamic energy than existing counterparts, without a significant speed penalty.
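For readers unfamiliar with the operation being accelerated, the following is a minimal software sketch of a 3D convolution over a stack of video frames. It is not the proposed hardware architecture: the function name `conv3d_valid`, the nested-list data layout, the CNN-style convention (no kernel flip), and the "valid" border handling are all assumptions made for illustration. It only shows the multiply-accumulate arithmetic that a convolution engine would have to carry out.

```python
def conv3d_valid(volume, kernel):
    """Direct 3D convolution, CNN-style (no kernel flip), 'valid' borders.

    Hypothetical reference model, not the paper's design.
    volume: nested list indexed [frame][row][col]
    kernel: nested list indexed [frame][row][col]
    """
    T, H, W = len(volume), len(volume[0]), len(volume[0][0])
    kT, kH, kW = len(kernel), len(kernel[0]), len(kernel[0][0])

    out = []
    for t in range(T - kT + 1):
        plane = []
        for r in range(H - kH + 1):
            row = []
            for c in range(W - kW + 1):
                # One output sample is a sum of kT * kH * kW products.
                acc = 0
                for dt in range(kT):
                    for dr in range(kH):
                        for dc in range(kW):
                            acc += (volume[t + dt][r + dr][c + dc]
                                    * kernel[dt][dr][dc])
                row.append(acc)
            plane.append(row)
        out.append(plane)
    return out


# Usage: three 3x3 frames whose pixels all equal the frame number (1, 2, 3),
# convolved with a 2x2x2 kernel of ones.
frames = [[[f] * 3 for _ in range(3)] for f in (1, 2, 3)]
ones = [[[1, 1], [1, 1]], [[1, 1], [1, 1]]]
result = conv3d_valid(frames, ones)
```

Each output sample depends only on its own input window, so the three outer loops carry no data dependencies between iterations. That independence is the kind of parallelism a SIMD datapath can exploit in general, although how the proposed engine maps it onto DSP slices is not described in the abstract.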
