Multi-Agent Connected Autonomous Driving using Deep Reinforcement Learning

The capability to learn and adapt to changes in the driving environment is crucial for developing autonomous driving systems that are scalable beyond geo-fenced operational design domains. Deep Reinforcement Learning (RL) provides a promising and scalable framework for developing adaptive learning based solutions. Deep RL methods usually model the problem as a (Partially Observable) Markov Decision Process in which an agent acts in a stationary environment to learn an optimal behavior policy. However, driving involves complex interaction between multiple, intelligent (artificial or human) agents in a highly non-stationary environment. In this paper, we propose the use of Partially Observable Markov Games(POSG) for formulating the connected autonomous driving problems with realistic assumptions. We provide a taxonomy of multi-agent learning environments based on the nature of tasks, nature of agents and the nature of the environment to help in categorizing various autonomous driving problems that can be addressed under the proposed formulation. As our main contributions, we provide MACAD-Gym, a Multi-Agent Connected, Autonomous Driving agent learning platform for furthering research in this direction. Our MACAD-Gym platform provides an extensible set of Connected Autonomous Driving (CAD) simulation environments that enable the research and development of Deep RL- based integrated sensing, perception, planning and control algorithms for CAD systems with unlimited operational design domain under realistic, multi-agent settings. We also share the MACAD-Agents that were trained successfully using the MACAD-Gym platform to learn control policies for multiple vehicle agents in a partially observable, stop-sign controlled, 3-way urban intersection environment with raw (camera) sensor observations.

[1]  David Janz,et al.  Learning to Drive in a Day , 2018, 2019 International Conference on Robotics and Automation (ICRA).

[2]  Ashish Kapoor,et al.  AirSim: High-Fidelity Visual and Physical Simulation for Autonomous Vehicles , 2017, FSR.

[3]  David Isele,et al.  Navigating Occluded Intersections with Autonomous Vehicles Using Deep Reinforcement Learning , 2018, 2018 IEEE International Conference on Robotics and Automation (ICRA).

[4]  Peter Stone,et al.  Autonomous agents modelling other agents: A comprehensive survey and open problems , 2017, Artif. Intell..

[5]  Germán Ros,et al.  CARLA: An Open Urban Driving Simulator , 2017, CoRL.

[6]  Wojciech Zaremba,et al.  OpenAI Gym , 2016, ArXiv.

[7]  L. Shapley,et al.  Stochastic Games* , 1953, Proceedings of the National Academy of Sciences.

[8]  Etienne Perot,et al.  Deep Reinforcement Learning framework for Autonomous Driving , 2017, Autonomous Vehicles and Machines.

[9]  Ching-Yao Chan,et al.  A Reinforcement Learning Based Approach for Automated Lane Change Maneuvers , 2018, 2018 IEEE Intelligent Vehicles Symposium (IV).

[10]  Sean Luke,et al.  Cooperative Multi-Agent Learning: The State of the Art , 2005, Autonomous Agents and Multi-Agent Systems.

[11]  John B. Kenney,et al.  Dedicated Short-Range Communications (DSRC) Standards in the United States , 2011, Proceedings of the IEEE.

[12]  Frans A. Oliehoek,et al.  Decentralized POMDPs , 2012, Reinforcement Learning.

[13]  Michael L. Littman,et al.  Markov Games as a Framework for Multi-Agent Reinforcement Learning , 1994, ICML.

[14]  Shane Legg,et al.  IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures , 2018, ICML.

[15]  Michael I. Jordan,et al.  RLlib: Abstractions for Distributed Reinforcement Learning , 2017, ICML.

[16]  Shimon Whiteson,et al.  Learning to Communicate with Deep Multi-Agent Reinforcement Learning , 2016, NIPS.

[17]  Alex Graves,et al.  Playing Atari with Deep Reinforcement Learning , 2013, ArXiv.

[18]  Yi Wu,et al.  Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments , 2017, NIPS.

[19]  Ion Stoica,et al.  Ray RLLib: A Composable and Scalable Reinforcement Learning Library , 2017, NIPS 2017.

[20]  David Budden,et al.  Distributed Prioritized Experience Replay , 2018, ICLR.

[21]  John M. Dolan,et al.  Learning On-Road Visual Control for Self-Driving Vehicles With Auxiliary Tasks , 2018, 2019 IEEE Winter Conference on Applications of Computer Vision (WACV).

[22]  John M. Dolan,et al.  POMDP and Hierarchical Options MDP with Continuous Actions for Autonomous Driving at Intersections , 2018, 2018 21st International Conference on Intelligent Transportation Systems (ITSC).

[23]  David Isele,et al.  Navigating Intersections with Autonomous Vehicles using Deep Reinforcement Learning , 2017 .

[24]  Gabriel Dulac-Arnold,et al.  Challenges of Real-World Reinforcement Learning , 2019, ArXiv.

[25]  Taxonomy and definitions for terms related to driving automation systems for on-road motor vehicles , 2022 .

[26]  Jeff G. Schneider,et al.  Approximate solutions for partially observable stochastic games with common payoffs , 2004, Proceedings of the Third International Joint Conference on Autonomous Agents and Multiagent Systems, 2004. AAMAS 2004..

[27]  Peter Stone,et al.  Deep Recurrent Q-Learning for Partially Observable MDPs , 2015, AAAI Fall Symposia.

[28]  Dorian Kodelja,et al.  Multiagent cooperation and competition with deep reinforcement learning , 2015, PloS one.