On the Resilience of Deep Learning for Reduced-voltage FPGAs

Deep Neural Networks (DNNs) are inherently computation-intensive and power-hungry. Hardware accelerators such as Field Programmable Gate Arrays (FPGAs) are a promising solution that can satisfy these requirements for both embedded and High-Performance Computing (HPC) systems. In FPGAs, as in CPUs and GPUs, aggressive voltage scaling below the nominal level is an effective technique for minimizing power dissipation. Unfortunately, as the voltage is scaled down toward the transistor threshold, bit-flip faults start to appear due to timing violations, creating a resilience issue. This paper experimentally evaluates the resilience of the training phase of DNNs in the presence of voltage-underscaling-related faults in FPGAs, especially in on-chip memories. Toward this goal, we have experimentally evaluated the resilience of LeNet-5 and of a network specially designed for the CIFAR-10 dataset, each with two activation functions: Rectified Linear Unit (ReLU) and Hyperbolic Tangent (Tanh). We have found that modern FPGAs are robust enough at extremely low voltage levels and that low-voltage-related faults can be automatically masked within the training iterations, so there is no need for costly software- or hardware-based fault-mitigation techniques such as ECC; approximately 10% more training iterations are needed to close the accuracy gap. This observation is a result of the relatively low rate of undervolting faults, i.e., <0.1%, measured on real FPGA fabrics. We have also increased the fault rate significantly for the LeNet-5 network through randomly generated fault-injection campaigns and observed that the training accuracy starts to degrade. As the fault rate increases, the network with the Tanh activation function outperforms the one with ReLU in terms of accuracy, e.g., at a 30% fault rate the accuracy difference is 4.92%.
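To illustrate the kind of randomly generated fault-injection campaign described above, the sketch below flips bits of a float32 weight tensor, with each stored bit flipped independently at a configurable per-bit fault rate. This is a minimal NumPy illustration under our own assumptions, not the authors' implementation; the function name `inject_bit_flips` and the `fault_rate` parameter are hypothetical, and a real campaign would apply such flips to the on-chip weight buffers at every training iteration.

```python
import numpy as np

def inject_bit_flips(weights: np.ndarray, fault_rate: float,
                     rng: np.random.Generator) -> np.ndarray:
    """Flip a random fraction of bits in a float32 tensor.

    Models undervolting-induced bit flips in on-chip weight memory:
    each bit of the stored representation is flipped independently
    with probability `fault_rate` (hypothetical parameter name).
    """
    # View the float32 payload as raw 32-bit words so that
    # individual bits can be toggled.
    bits = weights.astype(np.float32).view(np.uint32).copy()
    # Build a flip mask: each of the 32 bits per word is set
    # with probability fault_rate.
    mask = np.zeros_like(bits)
    for b in range(32):
        flips = rng.random(bits.shape) < fault_rate
        mask |= flips.astype(np.uint32) << b
    # XOR applies the bit flips; reinterpret the result as float32.
    return (bits ^ mask).view(np.float32)

# Example campaign: inject faults at a 0.1% per-bit rate (the upper
# bound measured on real FPGA fabrics) into a dummy weight tensor.
rng = np.random.default_rng(0)
w = rng.standard_normal((256, 256)).astype(np.float32)
w_faulty = inject_bit_flips(w, fault_rate=1e-3, rng=rng)
print("changed weights:", np.count_nonzero(w != w_faulty))
```

At the sub-0.1% rates observed on real hardware, most weights survive an injection untouched, which is consistent with the observation that the training process can absorb this noise with only modestly more iterations.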
