Highway Traffic Modeling and Decision Making for Autonomous Vehicle Using Reinforcement Learning

This paper studies the decision-making problem of autonomous vehicles in highway traffic. We model the interaction between an autonomous vehicle and its environment as a stochastic Markov decision process (MDP) and take the driving style of an experienced driver as the target to be learned. The MDP model accounts for road geometry in order to accommodate more diverse driving styles. The desired driving behavior of the autonomous vehicle is obtained by designing the reward function of the MDP and applying reinforcement learning. Simulation results demonstrate that the autonomous vehicle exhibits the desired driving behaviors.
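To illustrate the general idea of shaping driving behavior through an MDP reward function, the following is a minimal sketch using tabular Q-learning on a toy two-lane problem. It is not the paper's model: the state (ego lane, lane of a slow vehicle ahead), the two actions, the reward values, and the hyperparameters are all assumptions chosen for illustration, and road geometry is omitted entirely.

```python
import random

# Toy MDP, all values hypothetical (not taken from the paper).
LANES = (0, 1)
OBSTACLES = (-1, 0, 1)          # lane of a slow vehicle ahead; -1 means none
ACTIONS = ("keep", "change")


def step(state, action, rng):
    """Apply an action, return (next_state, reward)."""
    lane, obstacle = state
    new_lane = 1 - lane if action == "change" else lane
    reward = 1.0                 # reward for making progress
    if action == "change":
        reward -= 0.2            # small penalty for lane changes
    if new_lane == obstacle:
        reward -= 10.0           # large penalty for ending up behind the slow vehicle
    next_obstacle = rng.choice(OBSTACLES)   # stochastic traffic
    return (new_lane, next_obstacle), reward


def q_learning(steps=20000, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with an epsilon-greedy behavior policy."""
    rng = random.Random(seed)
    q = {((l, o), a): 0.0 for l in LANES for o in OBSTACLES for a in ACTIONS}
    state = (0, -1)
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.choice(ACTIONS)
        else:
            action = max(ACTIONS, key=lambda a: q[(state, a)])
        next_state, reward = step(state, action, rng)
        best_next = max(q[(next_state, a)] for a in ACTIONS)
        # Standard Q-learning temporal-difference update.
        q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
        state = next_state
    return q


def greedy_policy(q):
    """Extract the greedy action for every state from the Q-table."""
    return {
        (l, o): max(ACTIONS, key=lambda a: q[((l, o), a)])
        for l in LANES
        for o in OBSTACLES
    }


policy = greedy_policy(q_learning())
```

With these assumed rewards, the learned policy should change lanes when the slow vehicle is in the ego lane and keep the lane when it is in the other lane. Changing the penalty values changes which behavior is learned, which is the sense in which the reward function encodes a driving style.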
