QL-Net: Quantized-by-LookUp CNN

Convolutional Neural Networks (CNNs) have achieved state-of-the-art performance on a wide range of computer vision tasks. However, CNN inference is computationally and power intensive, which makes it difficult to run on wearable and embedded systems. One way to address this constraint is to reduce the number of computational operations performed. Several recent approaches tackle the computational complexity of CNNs, but most of them require dedicated hardware. We propose a new method for reducing computation in CNNs that substitutes Multiply and Accumulate (MAC) operations with codebook lookups and can be executed on generic hardware. The proposed method, QL-Net, combines several concepts: (i) codebook construction, (ii) a layer-wise retraining strategy, and (iii) substitution of MAC operations with lookups of precomputed convolution responses at inference time. QL-Net achieves 98.6% accuracy on the MNIST dataset with a 5.8x reduction in runtime, compared to a MAC-based CNN model that achieves 99.2% accuracy.
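The core idea of replacing MACs with lookups can be illustrated with a minimal sketch: quantize activations and weights to small codebooks, precompute the table of all codeword products once, and reduce each multiply at inference time to an index into that table. The function and variable names below (`build_codebook`, `lookup_dot`, etc.) are illustrative assumptions, not the paper's actual implementation, and the simple 1-D k-means here stands in for whatever codebook-construction procedure QL-Net uses.

```python
import numpy as np

def build_codebook(values, k, iters=10):
    """Cluster scalar values into k centroids with a few 1-D k-means steps.
    Deterministic init: centroids spread evenly over the value range."""
    centroids = np.linspace(values.min(), values.max(), k)
    for _ in range(iters):
        idx = np.argmin(np.abs(values[:, None] - centroids[None, :]), axis=1)
        for j in range(k):
            if np.any(idx == j):  # leave empty clusters unchanged
                centroids[j] = values[idx == j].mean()
    return centroids

def quantize(values, codebook):
    """Map each value to the index of its nearest codebook entry."""
    return np.argmin(np.abs(values[:, None] - codebook[None, :]), axis=1)

def lookup_dot(a_idx, w_idx, table):
    """Dot product via table lookup: each multiply becomes an index into
    the precomputed codeword-product table, followed by an accumulate."""
    return table[a_idx, w_idx].sum()

# Precompute all activation-codeword x weight-codeword products once.
a_vals = np.array([0.1, 0.9, 0.1, 0.9])    # toy activations
w_vals = np.array([-0.5, 0.5, 0.5, -0.5])  # toy weights
a_cb = build_codebook(a_vals, k=2)
w_cb = build_codebook(w_vals, k=2)
table = a_cb[:, None] * w_cb[None, :]       # shape (k_a, k_w)

approx = lookup_dot(quantize(a_vals, a_cb), quantize(w_vals, w_cb), table)
exact = a_vals @ w_vals
```

With codebooks of sizes k_a and k_w, the table holds only k_a x k_w entries, so it stays cache-resident; in this toy case the values are represented exactly by their codebooks and the lookup dot product matches the exact one.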
