ThUnderVolt: Enabling Aggressive Voltage Underscaling and Timing Error Resilience for Energy Efficient Deep Learning Accelerators
Jeff Zhang | Kartheek Rangineni | Zahra Ghodsi | Siddharth Garg