SMARTS: Scalable Multi-Agent Reinforcement Learning Training School for Autonomous Driving

Multi-agent interaction is a fundamental aspect of autonomous driving in the real world. Despite more than a decade of research and development, the problem of how to competently interact with diverse road users in diverse scenarios remains largely unsolved. Learning methods have much to offer toward solving this problem, but they require a realistic multi-agent simulator that generates diverse and competent driving interactions. To meet this need, we develop a dedicated simulation platform called SMARTS (Scalable Multi-Agent RL Training School). SMARTS supports the training, accumulation, and use of diverse behavior models of road users. These are in turn used to create increasingly realistic and diverse interactions that enable deeper and broader research on multi-agent interaction. In this paper, we describe the design goals of SMARTS, explain its basic architecture and key features, and illustrate its use through concrete multi-agent experiments on interactive scenarios. We open-source the SMARTS platform and the associated benchmark tasks and evaluation metrics to encourage and empower research on multi-agent learning for autonomous driving. Our code is available at this https URL.
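
To make the intended usage concrete, the sketch below illustrates the dict-keyed, Gym-style multi-agent interaction loop that platforms of this kind expose: each simulation step maps agent IDs to per-agent observations, actions, rewards, and done flags, so heterogeneous policies can be trained and evaluated side by side. All names here (ToyDrivingEnv, the agent IDs, the toy observation and reward) are illustrative stand-ins under my own assumptions, not the actual SMARTS API.

```python
# Minimal, self-contained sketch of a multi-agent, Gym-style interaction loop.
# Everything here is a hypothetical stand-in for illustration only.
import random
from typing import Dict, Tuple


class ToyDrivingEnv:
    """Stand-in multi-agent environment with per-agent (dict-keyed) I/O."""

    def __init__(self, agent_ids, horizon: int = 100):
        self.agent_ids = list(agent_ids)
        self.horizon = horizon
        self.t = 0

    def reset(self) -> Dict[str, Tuple[float, float]]:
        self.t = 0
        # Per-agent observation: (longitudinal position, speed); purely illustrative.
        return {aid: (0.0, 0.0) for aid in self.agent_ids}

    def step(self, actions: Dict[str, float]):
        self.t += 1
        # Toy dynamics: position advances with time, speed echoes the commanded action.
        obs = {aid: (float(self.t), actions[aid]) for aid in self.agent_ids}
        # Toy reward: penalize deviation from a nominal target speed of 1.0.
        rewards = {aid: -abs(actions[aid] - 1.0) for aid in self.agent_ids}
        dones = {aid: self.t >= self.horizon for aid in self.agent_ids}
        dones["__all__"] = self.t >= self.horizon
        return obs, rewards, dones, {}


if __name__ == "__main__":
    env = ToyDrivingEnv(agent_ids=["agent_0", "agent_1"])
    obs = env.reset()
    done = False
    while not done:
        # Each agent acts on its own observation; here, a random target speed.
        actions = {aid: random.uniform(0.0, 2.0) for aid in obs}
        obs, rewards, dones, _ = env.step(actions)
        done = dones["__all__"]
```

In the actual platform, the environment would instead be built from a scenario description and per-agent interface specifications, and the per-agent policies would typically be trained with a distributed RL library; the dict-keyed loop above is only meant to show the shape of the multi-agent interface.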
