BD-NET: A Multiplication-Less DNN with Binarized Depthwise Separable Convolution

In this work, we propose a multiplication-less deep convolutional neural network, called BD-NET. To the best of our knowledge, BD-NET is the first to use a binarized depthwise separable convolution block as a drop-in replacement for the conventional spatial convolution in deep convolutional neural networks (CNNs). In BD-NET, the computationally expensive convolution operations (i.e., multiply-and-accumulate operations) are converted into hardware-friendly addition/subtraction operations. We first investigate and analyze the performance of BD-NET in terms of accuracy, parameter size, and computation cost with respect to various network configurations. The experimental results then show that the proposed BD-NET with binarized depthwise separable convolution achieves even higher inference accuracy than its baseline CNN counterpart with full-precision conventional convolution layers on the CIFAR-10 dataset. From the perspective of hardware implementation, the convolution layer of BD-NET achieves up to 97.2%, 88.9%, and 99.4% reduction in computation energy, memory usage, and chip area, respectively.
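To make the core idea concrete, below is a minimal sketch (not the authors' released code) of a binarized depthwise separable convolution block in PyTorch. It binarizes the depthwise and pointwise weights to {-1, +1} with a straight-through estimator, so that the multiply-and-accumulate operations reduce to additions/subtractions; the use of BatchNorm and ReLU follows common practice and is an assumption, not necessarily the paper's exact configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class BinarizeSTE(torch.autograd.Function):
    """Binarize weights to {-1, +1}; pass gradients straight through."""

    @staticmethod
    def forward(ctx, w):
        ctx.save_for_backward(w)
        return torch.sign(w)

    @staticmethod
    def backward(ctx, grad_output):
        (w,) = ctx.saved_tensors
        # Clip gradients where |w| > 1 (standard straight-through estimator).
        return grad_output * (w.abs() <= 1).float()


class BinarizedDepthwiseSeparableConv(nn.Module):
    """Depthwise 3x3 + pointwise 1x1 convolution with binarized weights.

    With weights restricted to {-1, +1}, each multiply-and-accumulate
    reduces to an addition or subtraction in hardware.
    """

    def __init__(self, in_ch, out_ch, stride=1):
        super().__init__()
        self.dw_weight = nn.Parameter(torch.randn(in_ch, 1, 3, 3) * 0.1)
        self.pw_weight = nn.Parameter(torch.randn(out_ch, in_ch, 1, 1) * 0.1)
        self.bn1 = nn.BatchNorm2d(in_ch)
        self.bn2 = nn.BatchNorm2d(out_ch)
        self.stride = stride
        self.in_ch = in_ch

    def forward(self, x):
        # Depthwise convolution (groups == in_channels) with binarized weights.
        w_dw = BinarizeSTE.apply(self.dw_weight)
        x = F.conv2d(x, w_dw, stride=self.stride, padding=1, groups=self.in_ch)
        x = F.relu(self.bn1(x))
        # Pointwise 1x1 convolution with binarized weights.
        w_pw = BinarizeSTE.apply(self.pw_weight)
        x = F.conv2d(x, w_pw)
        return F.relu(self.bn2(x))


if __name__ == "__main__":
    block = BinarizedDepthwiseSeparableConv(in_ch=32, out_ch=64)
    print(block(torch.randn(1, 32, 16, 16)).shape)  # torch.Size([1, 64, 16, 16])
```

Such a block can replace a standard `nn.Conv2d` spatial convolution layer by matching its input/output channel counts and stride.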
