Learning from Simulation, Racing in Reality

We present a reinforcement learning-based approach to autonomous racing on a miniature race car platform. We show that a policy trained purely in simulation, using a relatively simple vehicle model with model randomization, can be successfully transferred to the real robotic setup. We achieve this with a novel policy output regularization approach and a lifted action space, which enable smooth actions while still allowing aggressive race car driving. The regularized policy outperforms the Soft Actor-Critic (SAC) baseline both in simulation and on the real car, but is still outperformed by a state-of-the-art Model Predictive Control (MPC) method. Refining the policy with three hours of real-world interaction data allows it to achieve lap times similar to the MPC controller while reducing track constraint violations by 50%.
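The abstract names two mechanisms, policy output regularization and a lifted action space, without detailing them. As a rough illustration only (the function names, the squared-difference penalty, and the `max_delta` bound are assumptions, not the paper's actual formulation), both ideas can be sketched as follows: the regularizer penalizes differences between consecutive actions, and the lifted action space has the policy output a bounded change to the previous command rather than an absolute command.

```python
import numpy as np

def smoothness_penalty(actions, weight=1.0):
    """Penalize consecutive action differences to encourage smooth control.

    actions: array of shape (T, action_dim), one row per time step.
    Returns weight * mean squared step-to-step change.
    """
    diffs = np.diff(actions, axis=0)              # (T-1, action_dim)
    return weight * np.mean(np.sum(diffs ** 2, axis=1))

def lifted_action(prev_action, delta, max_delta=0.1):
    """Lifted action space: the policy outputs a bounded increment `delta`
    that is applied to the previous command, so the executed action
    changes smoothly even if the raw policy output is jumpy.
    """
    delta = np.clip(delta, -max_delta, max_delta)
    return prev_action + delta
```

In this sketch, `smoothness_penalty` would be added to the policy's training loss, while `lifted_action` changes what the network outputs; the paper's actual regularizer and action parameterization may differ.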
