PositNN Framework: Tapered Precision Deep Learning Inference for the Edge

The performance of neural networks, especially of today's popular deep neural networks, is often limited by the underlying hardware: their computations are expensive, have a large memory footprint, and are power-hungry. Conventional reduced-precision numerical formats, such as fixed-point and floating-point, are not well suited to representing deep neural network parameters, which have a nonlinear distribution and a small dynamic range. The recently proposed posit numerical format, with its tapered precision, represents small values more accurately than these formats. In this work, we propose PositNN, a deep neural network framework that uses the posit numerical format and exact dot-product operations during inference. The efficacy of the ultra-low-precision version of PositNN is demonstrated against other frameworks (which use fixed-point and floating-point arithmetic) on three datasets (MNIST, Fashion MNIST, and CIFAR-10), where a {5-8}-bit PositNN outperforms other {5-8}-bit low-precision neural networks across all tasks.
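To make the abstract's claim about tapered precision concrete, the sketch below decodes a standard posit bit pattern (sign, regime, exponent, fraction) in Python. It is a minimal illustration of the posit format itself, assuming an n-bit posit with es exponent bits; it is not taken from the PositNN implementation, and the helper name decode_posit is ours.

```python
def decode_posit(p: int, nbits: int = 8, es: int = 1) -> float:
    """Decode an unsigned nbits-wide posit bit pattern into a Python float."""
    mask = (1 << nbits) - 1
    p &= mask
    if p == 0:
        return 0.0
    if p == 1 << (nbits - 1):          # Not-a-Real: the single exception pattern
        return float("inf")

    sign = 1.0
    if p >> (nbits - 1):               # negative posits are stored as two's complement
        sign = -1.0
        p = (-p) & mask

    # Bits after the sign, most significant first; a run of identical bits is the regime.
    rest = [(p >> i) & 1 for i in range(nbits - 2, -1, -1)]
    run = 1
    while run < len(rest) and rest[run] == rest[0]:
        run += 1
    k = run - 1 if rest[0] == 1 else -run          # regime value

    # Skip the regime's terminating bit, then read up to `es` exponent bits and the fraction.
    tail = rest[run + 1:]
    exp = 0
    for b in tail[:es]:
        exp = (exp << 1) | b
    exp <<= es - len(tail[:es])                    # missing exponent bits are implicit zeros
    frac_bits = tail[es:]
    frac = sum(b << (len(frac_bits) - 1 - i) for i, b in enumerate(frac_bits))

    scale = (1 << es) * k + exp                    # useed^k * 2^exp == 2^(k*2^es + exp)
    fraction = 1.0 + (frac / (1 << len(frac_bits)) if frac_bits else 0.0)
    return sign * (2.0 ** scale) * fraction


if __name__ == "__main__":
    # A few 8-bit, es=1 patterns; accuracy clusters around magnitudes near 1.0.
    for pattern in (0x40, 0x48, 0x60, 0x20, 0x01, 0x7F):
        print(f"{pattern:#04x} -> {decode_posit(pattern):g}")
```

Because the regime run lengthens for magnitudes far from 1, fewer bits are left for the fraction there and more remain near 1; this is the tapered-accuracy property that makes posits a good match for the small, nonlinearly distributed weights the abstract describes.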
