Automatically Generated Curriculum based Reinforcement Learning for Autonomous Vehicles in Urban Environment

We address the problem of learning autonomous driving behaviors at urban intersections using deep reinforcement learning (DRL). DRL has become a popular choice for training autonomous agents due to its success across a wide range of tasks. However, as the problems tackled become more complex, the number of training iterations required increases drastically. Curriculum learning has been shown to reduce training time and improve agent performance, but designing an optimal curriculum typically requires human handcrafting. In this work, we learn a policy for urban intersection crossing using DRL and introduce a method to automatically generate the training curriculum from a candidate set of tasks. We compare the performance of automatically generated curriculum (AGC) training against randomly generated task sequences and show that AGC significantly reduces training time while achieving similar or better performance.
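To illustrate the idea of building a curriculum automatically from a candidate task set, the sketch below greedily selects the next training task by recent learning progress (the improvement in evaluation score since the task was last trained). This is a common heuristic for automatic curriculum generation, not necessarily the selection criterion used in this work; the task names and the `train`/`evaluate` callables are hypothetical placeholders.

```python
def generate_curriculum(candidate_tasks, evaluate, train, rounds=10):
    """Greedily build a training curriculum from a candidate task set.

    Each round, pick the task with the largest recent improvement in
    evaluation score ("learning progress"), train on it, and append it
    to the curriculum. Unseen tasks start with infinite progress so
    every candidate is tried at least once.
    """
    progress = {t: float("inf") for t in candidate_tasks}  # optimistic init
    last_score = {t: 0.0 for t in candidate_tasks}
    curriculum = []
    for _ in range(rounds):
        # Select the task where the agent is currently improving fastest.
        task = max(candidate_tasks, key=lambda t: progress[t])
        train(task)
        score = evaluate(task)
        progress[task] = abs(score - last_score[task])
        last_score[task] = score
        curriculum.append(task)
    return curriculum
```

In a driving setting, the candidate tasks might be intersection scenarios of increasing difficulty (e.g., varying traffic density), with `evaluate` returning the agent's success rate on that scenario.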
