Learning How to Dynamically Route Autonomous Vehicles on Shared Roads

Road congestion imposes significant costs worldwide, and road network disturbances, such as traffic accidents, can create highly congested traffic patterns. A planner with control over the routing of every vehicle in the network could easily counteract this effect. We consider the more realistic scenario of a planner that controls only the autonomous cars, which constitute a fraction of all vehicles on the road. We study a dynamic routing game in which the route choices of autonomous cars can be controlled and human drivers react selfishly and dynamically to the autonomous cars' actions. Because the problem is prohibitively large, we use deep reinforcement learning to learn a policy for controlling the autonomous vehicles. This policy influences human drivers to route themselves in a way that minimizes congestion on the network. To gauge the effectiveness of the learned policies, we establish theoretical results characterizing equilibria on a network of parallel roads and empirically compare the learned policy against the best achievable equilibria. Moreover, we show that in the absence of such policies, high demand and network perturbations result in severe congestion, whereas applying the policy greatly decreases travel times by minimizing congestion. To the best of our knowledge, this is the first work to employ deep reinforcement learning to reduce congestion by influencing humans' routing decisions in mixed-autonomy traffic.
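The gap between selfish routing and planner-optimal routing that motivates this work can be illustrated on the simplest parallel-road network. The sketch below is not the paper's model; it is a minimal, self-contained example (standard Pigou-style setup, with hypothetical helper names) computing the Wardrop equilibrium, where all used routes have equal travel time, and the social optimum, which minimizes total travel time, for two parallel links with affine latencies.

```python
def equilibrium_split(a1, b1, a2, b2, d):
    # Wardrop equilibrium for demand d over two parallel links with
    # affine latencies l_i(x) = a_i * x + b_i: selfish drivers split
    # so that both used links have equal travel time.
    x1 = (a2 * d + b2 - b1) / (a1 + a2)
    return min(max(x1, 0.0), d)  # clip to feasible flows

def optimal_split(a1, b1, a2, b2, d):
    # Social optimum: minimize total travel time
    # x1*l1(x1) + (d - x1)*l2(d - x1), a convex quadratic in x1.
    x1 = (2 * a2 * d + b2 - b1) / (2 * (a1 + a2))
    return min(max(x1, 0.0), d)

def total_cost(a1, b1, a2, b2, d, x1):
    # Total travel time experienced by all drivers under split x1.
    x2 = d - x1
    return x1 * (a1 * x1 + b1) + x2 * (a2 * x2 + b2)

# Pigou's example: l1(x) = x, l2(x) = 1, unit demand.
x_eq = equilibrium_split(1.0, 0.0, 0.0, 1.0, 1.0)    # all traffic on link 1
x_opt = optimal_split(1.0, 0.0, 0.0, 1.0, 1.0)       # half on each link
cost_eq = total_cost(1.0, 0.0, 0.0, 1.0, 1.0, x_eq)   # 1.0
cost_opt = total_cost(1.0, 0.0, 0.0, 1.0, 1.0, x_opt) # 0.75
```

In this example selfish routing is 4/3 times worse than the optimum, the classic price-of-anarchy bound for affine latencies; a planner controlling even a fraction of the flow can push the equilibrium toward the optimal split.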
