MetaDrive: Composing Diverse Driving Scenarios for Generalizable Reinforcement Learning

Driving safely requires multiple capabilities from human and intelligent agents, such as generalizability to unseen environments, decision making in complex multi-agent settings, and safety awareness of surrounding traffic. Despite the great success of reinforcement learning (RL), most RL research studies each capability separately due to the lack of integrated interactive environments. In this work, we develop a new driving simulation platform called MetaDrive for the study of generalizable reinforcement learning algorithms. MetaDrive is highly compositional and can generate an infinite number of diverse driving scenarios through both procedural generation and real traffic data replay. Based on MetaDrive, we construct a variety of RL tasks and baselines in both single-agent and multi-agent settings, including benchmarking generalizability across unseen scenes, safe exploration, and learning multi-agent traffic. We open-source this simulator and maintain its development at https://github.com/decisionforce/metadrive.
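To make the compositional scenario generation concrete, below is a minimal sketch of driving a MetaDrive environment through its gym-style interface. The specific config keys (num_scenarios, start_seed, traffic_density) and the gymnasium-style reset/step signatures are assumptions based on recent releases of the package; the repository README should be treated as authoritative.

    # Minimal sketch, assuming a gym-style MetaDriveEnv and the config keys
    # below (num_scenarios, start_seed, traffic_density) as in recent releases.
    from metadrive import MetaDriveEnv

    env = MetaDriveEnv(config={
        "num_scenarios": 100,    # number of procedurally generated scenes (assumed key)
        "start_seed": 0,         # seed offset; different seeds compose different maps
        "traffic_density": 0.1,  # density of surrounding traffic vehicles
    })

    # Standard RL interaction loop; the 5-tuple step return assumes a
    # gymnasium-compatible version of the simulator.
    obs, info = env.reset()
    for _ in range(1000):
        action = env.action_space.sample()  # random steering/throttle for illustration
        obs, reward, terminated, truncated, info = env.step(action)
        if terminated or truncated:
            obs, info = env.reset()
    env.close()

Because each seed deterministically indexes a distinct procedurally generated scenario, splitting the seed range into disjoint training and test sets is a natural way to benchmark generalization to unseen scenes, as the paper proposes.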
