Air Learning: An AI Research Platform for Algorithm-Hardware Benchmarking of Autonomous Aerial Robots

We introduce Air Learning, an AI research platform for benchmarking algorithm-hardware performance and energy-efficiency trade-offs, focusing in particular on deep reinforcement learning (RL) in autonomous unmanned aerial vehicles (UAVs). Equipped with a random environment generator, Air Learning exposes a UAV to a diverse set of challenging scenarios. Users can specify a task, train different RL policies, and evaluate their performance and energy efficiency on a variety of hardware platforms. To show how Air Learning can be used, we seed it with Deep Q Networks (DQN) and Proximal Policy Optimization (PPO) to solve a point-to-point obstacle-avoidance task in three different environments produced by our configurable environment generator, training each algorithm both with and without curriculum learning. Air Learning assesses the trained policies under a variety of quality-of-flight (QoF) metrics, such as energy consumed, endurance, and average trajectory length, on resource-constrained embedded platforms such as a Raspberry Pi. We find that trajectories flown on an embedded Raspberry Pi differ substantially from those predicted on a high-end desktop system, with up to 79.43% longer trajectories in one of the environments. To understand the source of these differences, we use Air Learning to artificially degrade desktop performance so that it mimics a low-end embedded system; evaluating QoF metrics with hardware in the loop characterizes the differences and exposes how the choice of onboard compute affects the aerial robot's performance. We also conduct reliability studies to demonstrate how Air Learning can help assess the impact of sensor failures on the learned policies. Taken together, Air Learning enables a broad class of RL studies on UAVs. More information and code for Air Learning can be found here: this http URL
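
To make the evaluation workflow concrete, below is a minimal Python sketch of the loop described above: a random environment generator samples progressively harder configurations (a curriculum), one episode is flown per configuration, and per-episode QoF metrics are accumulated. Every name here (RandomEnvGenerator, QoFLogger, run_episode) and every numeric constant is an illustrative assumption for exposition, not Air Learning's actual API.

# Hypothetical sketch of a curriculum-style QoF evaluation loop in the spirit
# of Air Learning. All class/function names and constants are illustrative.

import random
from dataclasses import dataclass, field

@dataclass
class EnvConfig:
    """Parameters the random environment generator varies per episode."""
    arena_size: int        # side length of the arena (m)
    num_obstacles: int     # static obstacles placed at random
    goal_distance: float   # straight-line distance to the goal (m)

class RandomEnvGenerator:
    """Samples environment configurations of increasing difficulty."""
    def sample(self, difficulty: int) -> EnvConfig:
        return EnvConfig(
            arena_size=25 * (difficulty + 1),
            num_obstacles=random.randint(5 * difficulty, 10 * difficulty + 5),
            goal_distance=random.uniform(10.0, 10.0 * (difficulty + 1)),
        )

@dataclass
class QoFLogger:
    """Accumulates quality-of-flight metrics across evaluation episodes."""
    energy_j: list = field(default_factory=list)
    trajectory_m: list = field(default_factory=list)

    def record(self, energy_j: float, trajectory_m: float) -> None:
        self.energy_j.append(energy_j)
        self.trajectory_m.append(trajectory_m)

    def summary(self) -> dict:
        n = max(len(self.energy_j), 1)
        return {
            "mean_energy_j": sum(self.energy_j) / n,
            "mean_trajectory_m": sum(self.trajectory_m) / n,
        }

def run_episode(cfg: EnvConfig) -> tuple:
    """Placeholder for one rollout; a real setup would step the simulator
    with a trained DQN or PPO policy instead of sampling outcomes."""
    trajectory = cfg.goal_distance * random.uniform(1.0, 1.8)  # detours around obstacles
    energy = trajectory * 35.0                                 # ~35 J/m, illustrative only
    return energy, trajectory

if __name__ == "__main__":
    gen, log = RandomEnvGenerator(), QoFLogger()
    for difficulty in range(3):          # curriculum: easy -> hard
        for _ in range(10):              # episodes per curriculum stage
            energy, trajectory = run_episode(gen.sample(difficulty))
            log.record(energy, trajectory)
    print(log.summary())

Keeping the environment parameters and the metric logging separate, as sketched here, is what would let the same evaluation harness run unchanged whether the policy executes on a desktop or on an embedded board such as a Raspberry Pi.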
