BERRY: Bit Error Robustness for Energy-Efficient Reinforcement Learning-Based Autonomous Systems

Autonomous systems, such as Unmanned Aerial Vehicles (UAVs), are expected to run complex reinforcement learning (RL) models to execute fully autonomous position-navigation-time tasks within stringent onboard weight and power constraints. We observe that reducing the onboard operating voltage can improve the energy efficiency of both computation and the flight mission; however, it can also cause on-chip bit failures that are detrimental to mission safety and performance. To this end, we propose BERRY, a robust learning framework that improves bit error robustness and energy efficiency for RL-enabled autonomous systems. BERRY supports robust learning both offline and onboard the UAV and, for the first time, demonstrates the practicality of robust low-voltage operation on UAVs, which yields high energy savings in both compute-level operation and system-level quality-of-flight. We perform extensive experiments on 72 autonomous navigation scenarios and demonstrate that BERRY generalizes well across environments, UAVs, autonomy policies, operating voltages, and fault patterns, and consistently improves robustness, efficiency, and mission performance, achieving up to a 15.62% reduction in flight energy, an 18.51% increase in the number of successful missions, and a 3.43x reduction in processing energy.
