Training deep neural networks with low precision multiplications

Multipliers are the most space- and power-hungry arithmetic operators in digital implementations of deep neural networks. We train a set of state-of-the-art neural networks (Maxout networks) on three benchmark datasets: MNIST, CIFAR-10 and SVHN. Each network is trained with three distinct number formats: floating point, fixed point and dynamic fixed point. For each dataset and each format, we assess the impact of the precision of the multiplications on the final error after training. We find that very low precision is sufficient not just for running trained networks but also for training them. For example, it is possible to train Maxout networks with 10-bit multiplications.
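
To make the idea of a low-precision multiplication concrete, the following is a minimal sketch of how a fixed-point multiply could be simulated in software. It is not the authors' implementation: the function names `to_fixed_point` and `low_precision_multiply`, the split of 10 bits into a sign/integer part and 5 fractional bits, the saturation on overflow, and the use of Python's round-half-to-even are all assumptions chosen for illustration.

```python
def to_fixed_point(x, total_bits=10, frac_bits=5):
    """Round x onto a signed fixed-point grid and saturate on overflow.

    total_bits and frac_bits are illustrative assumptions, not values
    taken from the paper.
    """
    scale = 1 << frac_bits                 # grid step is 1 / scale
    max_int = (1 << (total_bits - 1)) - 1  # largest representable integer code
    min_int = -(1 << (total_bits - 1))     # smallest representable integer code
    q = round(x * scale)                   # nearest grid point (ties to even)
    q = max(min_int, min(max_int, q))      # saturate instead of wrapping around
    return q / scale


def low_precision_multiply(a, b, total_bits=10, frac_bits=5):
    """Quantize both operands, multiply, then quantize the product."""
    qa = to_fixed_point(a, total_bits, frac_bits)
    qb = to_fixed_point(b, total_bits, frac_bits)
    return to_fixed_point(qa * qb, total_bits, frac_bits)


if __name__ == "__main__":
    print(to_fixed_point(0.3))               # 0.3 snaps to the nearest multiple of 1/32
    print(low_precision_multiply(0.3, 0.5))  # product of the two quantized operands
    print(to_fixed_point(100.0))             # out-of-range input saturates
```

With these assumed settings the representable range is [-16, 15.96875] in steps of 1/32, so rounding error and saturation are the two ways precision is lost; the abstract's question is how small the bit width can be before that loss affects the final training error.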
