E-ERA: An energy-efficient reconfigurable architecture for RNNs using dynamically adaptive approximate computing

This paper proposes an Energy-Efficient Reconfigurable Architecture (E-ERA) for Recurrent Neural Networks (RNNs). E-ERA implements reconfigurable computing arrays built from approximate multipliers, together with a dynamically adaptive accuracy-control mechanism, to achieve high energy efficiency. A prototype of E-ERA is implemented in a TSMC 45 nm process. Experimental results show that, compared with traditional designs, E-ERA reduces power consumption by 28.6%–52.3% at the cost of only a 5.3%–9.2% loss in accuracy. Compared with state-of-the-art architectures, E-ERA delivers up to 1.78× higher power efficiency and achieves 304 GOPS/W when processing RNNs for speech recognition.
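The abstract does not detail E-ERA's approximate-multiplier design or its accuracy-control mechanism. As a minimal illustrative sketch only, the Python code below assumes a Mitchell-style iterative logarithmic multiplier, a common approximate-multiplier design in which each additional correction iteration trades energy for accuracy; a runtime controller could then choose the iteration count to realize dynamically adaptive accuracy. All names here are hypothetical, and the choice of multiplier is an assumption, not E-ERA's confirmed implementation.

```python
def _approx_step(a: int, b: int):
    # One basic approximation step for a * b (a, b > 0):
    #   a * b = 2^(ka+kb) + ra*2^kb + rb*2^ka + ra*rb,
    # where ka = floor(log2(a)) and ra = a - 2^ka (likewise for b).
    # The step computes the first three terms exactly and defers the
    # residual product ra * rb to the next iteration.
    ka, kb = a.bit_length() - 1, b.bit_length() - 1
    ra, rb = a - (1 << ka), b - (1 << kb)
    partial = (1 << (ka + kb)) + (ra << kb) + (rb << ka)
    return partial, ra, rb


def iterative_log_mul(a: int, b: int, iterations: int = 1) -> int:
    # Approximate unsigned multiplication: more iterations means a
    # smaller error, at the cost of more energy/latency in hardware.
    if a == 0 or b == 0:
        return 0
    product = 0
    for _ in range(iterations):
        partial, a, b = _approx_step(a, b)
        product += partial
        if a == 0 or b == 0:  # residue vanished: result is now exact
            break
    return product


if __name__ == "__main__":
    # Exact product is 200 * 153 = 30600; the approximation converges
    # toward it as the accuracy-controlling iteration count grows.
    for n in (1, 2, 3):
        print(n, iterative_log_mul(200, 153, n))  # 28800, 30528, 30600
```

In a hardware realization of this scheme, each iteration maps to a shift-and-add basic block, so an accuracy controller can gate unused blocks to save power, consistent with the power/accuracy trade-off the abstract reports.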
