Logic Synthesis of Binarized Neural Networks for Efficient Circuit Implementation

Neural networks (NNs) are key to deep learning systems. Their efficient hardware implementation is crucial to applications at the edge. Binarized NNs (BNNs), where the weights and output of a neuron are of binary values {–1, +1} (or encoded in {0, 1}), have been proposed recently. As no multiplier is required, they are particularly attractive and suitable for hardware realization. Most prior NN synthesis methods target on hardware architectures with neural processing elements (NPEs), where the weights of a neuron are loaded and the output of the neuron is computed. The load-and-compute method, though area efficient, requires expensive memory access, which deteriorates energy and performance efficiency. In this work we aim at synthesizing BNN dense layers into dedicated logic circuits. We formulate the corresponding matrix covering problem and propose a scalable algorithm to reduce the area and routing cost of BNNs. Experimental results justify the effectiveness of the method in terms of area and net savings on FPGA implementation. Our method provides an alternative implementation of BNNs, and can be applied in combination with NPE-based implementation for area, speed, and power tradeoffs.

[1]  Yann LeCun,et al.  The mnist database of handwritten digits , 2005 .

[2]  Mark Horowitz,et al.  1.1 Computing's energy problem (and what we can do about it) , 2014, 2014 IEEE International Solid-State Circuits Conference Digest of Technical Papers (ISSCC).

[3]  Rajesh Gupta,et al.  Accelerating Binarized Convolutional Neural Networks with Software-Programmable FPGAs , 2017, FPGA.

[4]  Ran El-Yaniv,et al.  Binarized Neural Networks , 2016, NIPS.

[5]  Yoshua Bengio,et al.  Gradient-based learning applied to document recognition , 1998, Proc. IEEE.

[6]  Guigang Zhang,et al.  Deep Learning , 2016, Int. J. Semantic Comput..

[7]  Alex Krizhevsky,et al.  Learning Multiple Layers of Features from Tiny Images , 2009 .

[8]  Philip Heng Wai Leong,et al.  FINN: A Framework for Fast, Scalable Binarized Neural Network Inference , 2016, FPGA.

[9]  Song Han,et al.  Deep Compression: Compressing Deep Neural Network with Pruning, Trained Quantization and Huffman Coding , 2015, ICLR.

[10]  Alberto Sangiovanni-Vincentelli,et al.  Logic synthesis for vlsi design , 1989 .

[11]  Philip Heng Wai Leong,et al.  Scaling Binarized Neural Networks on Reconfigurable Logic , 2017, PARMA-DITAM '17.

[12]  Dumitru Erhan,et al.  Going deeper with convolutions , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[13]  Nils J. Nilsson,et al.  A Formal Basis for the Heuristic Determination of Minimum Cost Paths , 1968, IEEE Trans. Syst. Sci. Cybern..

[14]  Frédéric Pétrot,et al.  Ternary neural networks for resource-efficient AI applications , 2016, 2017 International Joint Conference on Neural Networks (IJCNN).

[15]  Hiroki Nakahara,et al.  On-Chip Memory Based Binarized Convolutional Deep Neural Network Applying Batch Normalization Free Technique on an FPGA , 2017, 2017 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW).

[16]  Ali Farhadi,et al.  XNOR-Net: ImageNet Classification Using Binary Convolutional Neural Networks , 2016, ECCV.

[17]  Bo Chen,et al.  MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications , 2017, ArXiv.

[18]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).