Flow: A Modular Learning Framework for Mixed Autonomy Traffic

The rapid development of autonomous vehicles (AVs) holds vast potential for transportation systems through improved safety, efficiency, and access to mobility. However, how these impacts progress as AVs are adopted is not well understood. Analyzing the partial adoption of autonomy raises numerous technical challenges: partial control and observation, multi-vehicle interactions, and the sheer variety of scenarios represented by real-world networks. To shed light on near-term AV impacts, this article studies the suitability of deep reinforcement learning (RL) for overcoming these challenges in a low AV-adoption regime. A modular learning framework is presented, which leverages deep RL to address complex traffic dynamics. Modules are composed to capture common traffic phenomena (stop-and-go traffic jams, lane changing, intersections). Learned control laws are found to improve upon human driving performance, in terms of system-level velocity, by up to 57% with only 4-7% adoption of AVs. Furthermore, in single-lane traffic, a small neural network control law with only local observation is found to eliminate stop-and-go traffic, surpassing all known model-based controllers to achieve near-optimal performance, and to generalize to out-of-distribution traffic densities.
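To make the idea of a small neural network control law with only local observation more concrete, the following is a minimal Python sketch. Everything specific in it is an assumption made for illustration and is not taken from the article: the choice of observations (own speed, speed difference to the leading vehicle, headway), the normalization constants, the layer sizes, the acceleration bound, and the function names `init_mlp` and `local_policy`. The weights are random and untrained; in the article's setting they would be learned with deep RL.

```python
import math
import random


def init_mlp(sizes, seed=0):
    """Create random (untrained) weights for a small fully connected network.

    `sizes` lists the layer widths, e.g. [3, 16, 16, 1]. Hypothetical helper.
    """
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        scale = 1.0 / math.sqrt(n_in)
        weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
                   for _ in range(n_out)]
        biases = [0.0] * n_out
        layers.append((weights, biases))
    return layers


def local_policy(layers, speed, headway, leader_speed,
                 max_speed=30.0, max_headway=100.0, max_accel=1.0):
    """Map local observations of one AV to a bounded acceleration command.

    The observation set and the normalization constants are assumptions.
    """
    # Normalize the three local observations to roughly unit scale.
    x = [
        speed / max_speed,
        (leader_speed - speed) / max_speed,
        min(headway, max_headway) / max_headway,
    ]
    # Forward pass: tanh after every layer, including the output layer,
    # so the final activation lies strictly between -1 and 1.
    for weights, biases in layers:
        x = [math.tanh(sum(w * v for w, v in zip(row, x)) + b)
             for row, b in zip(weights, biases)]
    # Scale the single output to the allowed acceleration range.
    return max_accel * x[0]


layers = init_mlp([3, 16, 16, 1])
accel = local_policy(layers, speed=5.0, headway=20.0, leader_speed=4.0)
```

The policy reads only quantities a single vehicle could measure about itself and its leader, which is what "local observation" is taken to mean in this sketch. Bounding the output with `tanh` keeps the commanded acceleration within the assumed limit regardless of the weights.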
