Learning from Simulation, Racing in Reality

We present a reinforcement learning-based solution for autonomous racing on a miniature race car platform. We show that a policy trained purely in simulation using a relatively simple vehicle model, together with model randomization, can be successfully transferred to the real robotic setup. We achieve this with a novel policy output regularization approach and a lifted action space, which enable smooth actions while still allowing aggressive race car driving. We show that the regularized policy outperforms the Soft Actor-Critic (SAC) baseline, both in simulation and on the real car, although it is still outperformed by a state-of-the-art Model Predictive Control (MPC) method. Refining the policy with three hours of real-world interaction data allows it to achieve lap times similar to the MPC controller while reducing track constraint violations by 50%.
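The lifted action space mentioned above can be illustrated with a minimal sketch: instead of outputting commands directly, the policy outputs action *rates*, which are integrated over time so the applied commands (e.g., steering and throttle) change smoothly between timesteps. The class and parameter names below are hypothetical, and the exact lifting used in the paper may differ; this is only one common way to realize the idea.

```python
import numpy as np

class LiftedActionSpace:
    """Sketch of a lifted action space: the policy emits action derivatives,
    which are rate-limited and integrated into the actual commands.
    Integration makes consecutive commands smooth by construction."""

    def __init__(self, low, high, rate_limit, dt):
        self.low = np.asarray(low, dtype=float)    # lower command bounds
        self.high = np.asarray(high, dtype=float)  # upper command bounds
        self.rate_limit = float(rate_limit)        # max command change per second
        self.dt = float(dt)                        # control timestep in seconds
        self.action = np.zeros_like(self.low)

    def reset(self):
        self.action = np.zeros_like(self.low)
        return self.action

    def step(self, derivative):
        # Clip the policy's rate output, integrate one timestep,
        # then clip the resulting command to its physical bounds.
        d = np.clip(derivative, -self.rate_limit, self.rate_limit)
        self.action = np.clip(self.action + d * self.dt, self.low, self.high)
        return self.action

# Example: steering in [-1, 1], throttle in [0, 1], 20 Hz control.
space = LiftedActionSpace(low=[-1, 0], high=[1, 1], rate_limit=2.0, dt=0.05)
space.reset()
a = space.step([10.0, 10.0])  # large rates get clipped to 2.0 per second
```

Because each command can move by at most `rate_limit * dt` per step, abrupt actuator jumps are ruled out regardless of how noisy the raw policy output is, which complements the output regularization described in the abstract.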
