Navigating Occluded Intersections with Autonomous Vehicles Using Deep Reinforcement Learning

Providing an efficient strategy to navigate safely through unsignaled intersections is a difficult task that requires determining the intent of other drivers. We explore the effectiveness of Deep Reinforcement Learning for handling intersection problems. Using recent advances in Deep RL, we learn policies that surpass a commonly-used heuristic approach on several metrics, including task completion time and goal success rate, though the learned policies show limited ability to generalize. We then explore a system's ability to learn active sensing behaviors that enable safe navigation in the presence of occlusions. Our analysis provides insight into the intersection handling problem: the solutions learned by the network expose several shortcomings of current rule-based methods, and the failures of our current deep reinforcement learning system point to future research directions.
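The kind of discrete go/wait decision the learned policies make can be sketched with a toy example. The following is a minimal, purely illustrative stand-in: a tabular Q-learning agent (rather than the deep Q-network used in the paper) on a hypothetical three-state occluded-intersection MDP whose state names, transitions, and rewards are all assumptions made for this sketch, not the paper's simulator.

```python
import random

random.seed(0)

# Hypothetical toy MDP: the agent must learn to wait until occluding
# traffic clears before crossing. States and rewards are illustrative.
STATES = ["occluded", "clear", "done"]
ACTIONS = ["wait", "go"]

def step(state, action):
    """Return (next_state, reward) for the toy intersection."""
    if state == "occluded":
        if action == "wait":
            return ("clear", -0.1)   # small time penalty; gap is revealed
        return ("done", -10.0)       # crossing blind risks a collision
    if state == "clear":
        if action == "go":
            return ("done", +1.0)    # safe crossing
        return ("clear", -0.1)       # needless waiting
    return ("done", 0.0)

# Tabular Q-learning update (stand-in for a deep Q-network).
Q = {(s, a): 0.0 for s in STATES for a in ACTIONS}
alpha, gamma, eps = 0.5, 0.95, 0.1

for _ in range(500):
    s = "occluded"
    while s != "done":
        # Epsilon-greedy action selection.
        if random.random() < eps:
            a = random.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda x: Q[(s, x)])
        s2, r = step(s, a)
        best_next = max(Q[(s2, b)] for b in ACTIONS)
        Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
        s = s2

policy = {s: max(ACTIONS, key=lambda a: Q[(s, a)]) for s in STATES[:2]}
print(policy)  # → {'occluded': 'wait', 'clear': 'go'}
```

Even in this toy setting, the learned policy captures the qualitative behavior the paper studies: waiting while the view is occluded and proceeding once the gap is visible. The deep RL approach replaces the table with a neural network so the same idea scales to continuous sensor observations.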
