MASR: A Modular Accelerator for Sparse RNNs
Alexander M. Rush | Gu-Yeon Wei | David Brooks | Brandon Reagen | Udit Gupta | Lillian Pentecost | Marco Donato | Thierry Tambe