Algorithm-Hardware Co-Design of Adaptive Floating-Point Encodings for Resilient Deep Learning Inference
Thierry Tambe | En-Yu Yang | Zishen Wan | Yuntian Deng | Vijay Janapa Reddi | Alexander M. Rush | David Brooks | Gu-Yeon Wei