Paper information - Flow: A Modular Learning Framework for Autonomy in Traffic.

The rapid development of autonomous vehicles (AVs) holds vast potential for transportation systems through improved safety, efficiency, and access to mobility. However, due to numerous technical, political, and human factors challenges, new methodologies are needed to design vehicles and transportation systems for these positive outcomes. This article tackles important technical challenges arising from the partial adoption of autonomy (hence termed mixed autonomy, to involve both AVs and human-driven vehicles): partial control, partial observation, complex multi-vehicle interactions, and the sheer variety of traffic settings represented by real-world networks. To enable the study of the full diversity of traffic settings, we first propose to decompose traffic control tasks into modules, which may be configured and composed to create new control tasks of interest. These modules include salient aspects of traffic control tasks: networks, actors, control laws, metrics, initialization, and additional dynamics. Second, we study the potential of model-free deep Reinforcement Learning (RL) methods to address the complexity of traffic dynamics. The resulting modular learning framework is called Flow. Using Flow, we create and study a variety of mixed-autonomy settings, including single-lane, multi-lane, and intersection traffic. In all cases, the learned control law exceeds human driving performance (measured by system-level velocity) by at least 40% with only 5-10% adoption of AVs. In the case of partially-observed single-lane traffic, we show that a low-parameter neural network control law can eliminate commonly observed stop-and-go traffic. In particular, the control laws surpass all known model-based controllers, achieving near-optimal performance across a wide spectrum of vehicle densities (even with a memoryless control law) and generalizing to out-of-distribution vehicle densities.
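The modular decomposition described in the abstract can be illustrated with a toy sketch. This is not Flow's actual API: the names `Vehicle`, `Task`, `follow_gap`, `cautious`, `mean_speed`, `uniform_init`, `step`, and `rollout` are hypothetical, the two control laws are hand-written placeholders rather than learned policies, and the ring length, gains, and speed cap are arbitrary assumptions. It only shows how a network, actors, control laws, a metric, and an initialization might be composed into one task.

```python
from dataclasses import dataclass
from typing import Callable, List

# A control law maps (gap to leader, own speed) to an acceleration.
ControlLaw = Callable[[float, float], float]


@dataclass
class Vehicle:
    """Actor module: state plus the control law that drives it."""
    position: float
    speed: float
    control_law: ControlLaw


@dataclass
class Task:
    """A control task composed from network, actors, and metric modules."""
    ring_length: float                        # network: single-lane ring (m)
    vehicles: List[Vehicle]                   # actors with their control laws
    metric: Callable[[List[Vehicle]], float]  # system-level metric


def follow_gap(gap: float, speed: float) -> float:
    """Placeholder human-like law: track a gap-dependent target speed."""
    target = min(30.0, 0.5 * gap)
    return 1.0 * (target - speed)


def cautious(gap: float, speed: float) -> float:
    """Placeholder law standing in for an automated vehicle (assumption)."""
    target = min(30.0, 0.4 * gap)
    return 0.5 * (target - speed)


def mean_speed(vehicles: List[Vehicle]) -> float:
    """Metric module: average speed over all vehicles."""
    return sum(v.speed for v in vehicles) / len(vehicles)


def uniform_init(ring_length: float, laws: List[ControlLaw]) -> List[Vehicle]:
    """Initialization module: evenly spaced vehicles starting at rest."""
    spacing = ring_length / len(laws)
    return [Vehicle(i * spacing, 0.0, law) for i, law in enumerate(laws)]


def step(task: Task, dt: float = 0.1) -> None:
    """Advance the simulation by one explicit Euler step."""
    ordered = sorted(task.vehicles, key=lambda v: v.position)
    n = len(ordered)
    accels = []
    for i, v in enumerate(ordered):
        leader = ordered[(i + 1) % n]
        gap = (leader.position - v.position) % task.ring_length
        accels.append(v.control_law(gap, v.speed))
    for v, a in zip(ordered, accels):
        v.speed = max(0.0, v.speed + a * dt)
        v.position = (v.position + v.speed * dt) % task.ring_length


def rollout(task: Task, steps: int = 500) -> float:
    """Run the task and report the metric at the end of the rollout."""
    for _ in range(steps):
        step(task)
    return task.metric(task.vehicles)


# Compose a mixed-autonomy task: one "cautious" vehicle among nine others.
laws = [cautious] + [follow_gap] * 9
task = Task(200.0, uniform_init(200.0, laws), mean_speed)
score = rollout(task)
```

Swapping the ring for another network, or replacing `cautious` with a learned policy, would change only one module and leave the rest of the task definition untouched, which is the point of the decomposition.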
