VLSI Implementation of Deep Neural Network Using Integral Stochastic Computing

The hardware implementation of deep neural networks (DNNs) has recently received considerable attention because many applications require high-speed operation. However, DNNs usually require numerous processing elements and complex interconnections, leading to large area occupation and high power consumption. Stochastic computing has shown promising results for area-efficient hardware implementations, even though existing stochastic algorithms require long streams that incur long latency. In this paper, we propose an integer form of stochastic computation and introduce some elementary circuits. We then propose an efficient implementation of a DNN based on integral stochastic computing. The proposed architecture uses integer stochastic streams and a modified finite-state-machine-based tanh function to improve performance and reduce latency compared to existing stochastic architectures for DNNs. The simulation results show that the proposed integer stochastic DNN incurs negligible performance loss for different network sizes compared to its floating-point counterparts.
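The stream-based encoding described above can be illustrated with a small software sketch. It is not the paper's circuit: it assumes the conventional unipolar encoding (each bit is 1 with probability p, so an AND gate multiplies two values) and assumes that an integer stochastic stream can be modelled as the element-wise sum of m independent binary streams. The helper names (`to_stream`, `to_integer_stream`, `mean`) and the stream length `N` are hypothetical and chosen only for illustration.

```python
import random


def to_stream(p, length, rng):
    """Unipolar encoding: each bit is 1 with probability p (0 <= p <= 1)."""
    return [1 if rng.random() < p else 0 for _ in range(length)]


def to_integer_stream(x, m, length, rng):
    """Assumed integer encoding: element-wise sum of m independent binary
    streams, each encoding x / m, so every element lies in 0..m and the
    stream mean approximates x (0 <= x <= m)."""
    streams = [to_stream(x / m, length, rng) for _ in range(m)]
    return [sum(bits) for bits in zip(*streams)]


def mean(stream):
    """Decode a stream by averaging its elements."""
    return sum(stream) / len(stream)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
N = 4096                # stream length (illustrative choice)

# Conventional stochastic multiplication: a bitwise AND of two independent
# unipolar streams yields a stream whose mean approximates the product.
a = to_stream(0.5, N, rng)
b = to_stream(0.4, N, rng)
product = [x & y for x, y in zip(a, b)]

# Integer stream: a value outside [0, 1] carried by elements in 0..m.
s = to_integer_stream(1.5, 2, N, rng)

print("estimated 0.5 * 0.4:", mean(product))
print("estimated 1.5:", mean(s))
```

Because each element of the integer stream carries more than one bit of information, values outside [0, 1] can be represented without scaling them down, which is one plausible reading of why shorter streams and lower latency become possible.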
