Adaptation of Convolution and Batch Normalization Layer for CNN Implementation on FPGA

This article presents the integration of the convolution and batch normalization layers for subsequent implementation on an FPGA. The convolution kernel is binarized and merged with batch normalization into a single core, which is implemented on a single DSP. The concept is demonstrated on a custom binarized convolutional neural network (CNN) trained in Matlab to solve an object localization task. With 16-bit precision, the output of the merged convolution and batch normalization core has an error of 1.3 %. The localization accuracy decreases on average by 7 percentage points, from 74 % to 67 %, which is still tolerable in embedded systems applications.
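The merging step can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no equations, so the sketch assumes the standard batch normalization formula, y = gamma * (x - mean) / sqrt(var + eps) + beta, applied to the output of a convolution whose weights are binarized to +1/-1. Under that assumption the two layers collapse into one scale and one offset, so each output needs a single multiply-add, which is the kind of operation a DSP block performs. The function names, the Q8.8 split of the 16-bit word, and the sample values are all hypothetical.

```python
import math

def sign(w):
    # Binarize a weight to +1 or -1 (zero is mapped to +1 by assumption).
    return 1 if w >= 0 else -1

def conv_bn_reference(window, weights, bias, gamma, beta, mean, var, eps=1e-5):
    # Unmerged floating-point reference: binarized convolution, then batch norm.
    acc = sum(x * sign(w) for x, w in zip(window, weights)) + bias
    return gamma * (acc - mean) / math.sqrt(var + eps) + beta

def fold_bn(bias, gamma, beta, mean, var, eps=1e-5):
    # Collapse bias and batch norm into y = scale * acc + offset.
    scale = gamma / math.sqrt(var + eps)
    offset = beta + scale * (bias - mean)
    return scale, offset

def to_fixed(value, frac_bits):
    # Quantize a real value to an integer with frac_bits fractional bits.
    return int(round(value * (1 << frac_bits)))

def conv_bn_fixed(window_q, weights, scale_q, offset_q, frac_bits):
    # Binary weights need no multipliers: each input is added or subtracted.
    acc = 0
    for x, w in zip(window_q, weights):
        acc += x if w >= 0 else -x
    # One multiply-add; the product carries 2 * frac_bits fractional bits,
    # so the offset is aligned before the result is shifted back.
    return (acc * scale_q + (offset_q << frac_bits)) >> frac_bits

# Hypothetical 3x3 window and parameters; Q8.8 is an assumed 16-bit format.
FRAC_BITS = 8
window = [0.5, -0.25, 0.75, 0.125, -0.5, 1.0, -1.0, 0.25, 0.0]
weights = [0.3, -0.7, 0.1, -0.2, 0.9, -0.4, 0.6, -0.8, 0.05]
params = dict(bias=0.1, gamma=1.2, beta=-0.3, mean=0.2, var=0.8)

scale, offset = fold_bn(**params)
y_ref = conv_bn_reference(window, weights, **params)
y_fixed = conv_bn_fixed(
    [to_fixed(x, FRAC_BITS) for x in window],
    weights,
    to_fixed(scale, FRAC_BITS),
    to_fixed(offset, FRAC_BITS),
    FRAC_BITS,
) / (1 << FRAC_BITS)
print("reference:", y_ref, "fixed-point:", y_fixed)
```

The gap between the two printed values comes only from quantizing the scale, the offset, and the final shift. It illustrates the kind of precision loss the abstract reports for the merged core, but it does not reproduce the 1.3 % figure.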
