E-ERA: An energy-efficient reconfigurable architecture for RNNs using dynamically adaptive approximate computing

This paper proposes an Energy-Efficient Reconfigurable Architecture (E-ERA) for Recurrent Neural Networks (RNNs). E-ERA implements reconfigurable computing arrays built from approximate multipliers, together with a dynamically adaptive accuracy-control mechanism, to achieve high energy efficiency. A prototype of E-ERA is implemented in a TSMC 45 nm process. Experimental results show that, compared with traditional designs, E-ERA reduces power consumption by 28.6%–52.3% at the cost of only a 5.3%–9.2% loss in accuracy. Compared with state-of-the-art architectures, E-ERA delivers up to 1.78× higher power efficiency and achieves 304 GOPS/W when processing RNNs for speech recognition.
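The abstract does not detail E-ERA's approximate-multiplier design or its accuracy-control mechanism. As a minimal illustrative sketch only, the Python code below assumes a Mitchell-style iterative logarithmic multiplier, a common approximate-multiplier design in which each additional correction iteration trades energy for accuracy; a runtime controller could then choose the iteration count to realize dynamically adaptive accuracy. All names here are hypothetical, and the choice of multiplier is an assumption, not E-ERA's confirmed implementation.

```python
def _approx_step(a: int, b: int):
    # One basic approximation step for a * b (a, b > 0):
    #   a * b = 2^(ka+kb) + ra*2^kb + rb*2^ka + ra*rb,
    # where ka = floor(log2(a)) and ra = a - 2^ka (likewise for b).
    # The step computes the first three terms exactly and defers the
    # residual product ra * rb to the next iteration.
    ka, kb = a.bit_length() - 1, b.bit_length() - 1
    ra, rb = a - (1 << ka), b - (1 << kb)
    partial = (1 << (ka + kb)) + (ra << kb) + (rb << ka)
    return partial, ra, rb


def iterative_log_mul(a: int, b: int, iterations: int = 1) -> int:
    # Approximate unsigned multiplication: more iterations means a
    # smaller error, at the cost of more energy/latency in hardware.
    if a == 0 or b == 0:
        return 0
    product = 0
    for _ in range(iterations):
        partial, a, b = _approx_step(a, b)
        product += partial
        if a == 0 or b == 0:  # residue vanished: result is now exact
            break
    return product


if __name__ == "__main__":
    # Exact product is 200 * 153 = 30600; the approximation converges
    # toward it as the accuracy-controlling iteration count grows.
    for n in (1, 2, 3):
        print(n, iterative_log_mul(200, 153, n))  # 28800, 30528, 30600
```

In a hardware realization of this scheme, each iteration maps to a shift-and-add basic block, so an accuracy controller can gate unused blocks to save power, consistent with the power/accuracy trade-off the abstract reports.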
