Bridge deep learning to the physical world: An efficient method to quantize networks

As deep convolutional networks achieve better performance with more and more layers, the growing number of weight and bias parameters makes them practical only on servers in cyberspace and infeasible to deploy in physical-world embedded systems, owing to their huge storage and memory-bandwidth requirements. In this paper, we propose an efficient method to quantize the model parameters. Instead of treating quantization as a harmful loss of precision, we regard it as a regularization problem that prevents overfitting, and we develop a two-stage quantization technique consisting of soft and hard quantization. With our method, shrinking the word length from 32 bits to 2 bits reduces the parameter memory size by 93.75%; moreover, the testing accuracy after quantization surpasses previous approaches on some datasets, and the additional training overhead is only 3% of ordinary training.
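The abstract does not spell out the soft- and hard-quantization rules, so the following is only a minimal NumPy sketch of the two-stage idea under stated assumptions: the soft stage is assumed to add a penalty pulling each weight toward its nearest 2-bit level (the regularization view), and the hard stage is assumed to snap weights onto those levels for deployment. The level set, the per-layer scale, and names such as `soft_quantization_penalty` are hypothetical illustrations, not the paper's actual formulation.

```python
import numpy as np

# Assumed 2-bit codebook: four uniform levels (the paper's levels may differ).
LEVELS = np.array([-1.0, -1.0 / 3.0, 1.0 / 3.0, 1.0])

def nearest_level(w, scale):
    """Map each weight to the closest scaled quantization level."""
    grid = scale * LEVELS
    idx = np.abs(w[..., None] - grid).argmin(axis=-1)
    return grid[idx]

def soft_quantization_penalty(w, scale, lam=1e-4):
    """Assumed soft stage: an L2 penalty on the distance to the nearest
    level, added to the task loss during training so weights drift toward
    quantizable values (quantization acting as a regularizer)."""
    return lam * np.sum((w - nearest_level(w, scale)) ** 2)

def hard_quantize(w, scale):
    """Assumed hard stage: replace 32-bit weights with their 2-bit levels.
    Storing 2-bit codes instead of 32-bit floats saves 1 - 2/32 = 93.75%
    of parameter memory, matching the figure quoted in the abstract."""
    return nearest_level(w, scale)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = rng.normal(scale=0.05, size=(64, 64)).astype(np.float32)
    s = np.abs(w).max()  # assumed per-layer scale factor
    print("soft penalty:", soft_quantization_penalty(w, s))
    print("distinct levels after hard stage:",
          np.unique(hard_quantize(w, s)).size)  # at most 4 (2 bits)
```

In this reading, training would alternate ordinary gradient steps with the soft penalty before the final hard snap, which is consistent with the small (3%) training overhead the abstract reports.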
