Intersection Navigation Under Dynamic Constraints Using Deep Reinforcement Learning

In this study, we present a unified motion planner with a low-level controller for continuous control of a differential-drive mobile robot. The deep reinforcement learning agent takes a 10-dimensional state vector as input and outputs each wheel's torque as a 2-dimensional action vector. These torque values are fed into the robot's dynamic model, from which the steering commands are derived. Previous deep-RL navigation studies have not accounted for the agent's own dynamic constraints, relying on kinematic models alone; this is not reliable enough for real-world scenarios. In this paper, deep-RL-based motion planning is performed by considering both kinematic and dynamic constraints. In simulations in a dynamic environment, the agent successfully navigates the intersection with a 99.6% success rate.
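
To make the described pipeline concrete, below is a minimal sketch (not the authors' code) of one torque-level control cycle: a learned policy maps the 10-dimensional state to two wheel torques, a simplified dynamic model integrates those torques into wheel speeds, and standard differential-drive kinematics converts the wheel speeds into a steering command. All robot parameters, the friction model, and the placeholder policy are illustrative assumptions, not values from the paper.

```python
import numpy as np

# Hypothetical robot parameters (assumptions, not from the paper).
WHEEL_RADIUS = 0.05    # m
WHEEL_BASE = 0.30      # m, distance between the two drive wheels
WHEEL_INERTIA = 0.002  # kg*m^2, rotational inertia of one wheel + motor
DT = 0.01              # s, control timestep


def policy(state: np.ndarray) -> np.ndarray:
    """Stand-in for the trained deep-RL actor: 10-D state -> 2-D torque.

    In the paper this would be a neural network trained for continuous
    control; here it returns a fixed torque pair purely for illustration.
    """
    assert state.shape == (10,)
    return np.array([0.1, 0.1])  # [tau_left, tau_right] in N*m


def dynamics_step(omega: np.ndarray, torque: np.ndarray) -> np.ndarray:
    """Integrate wheel angular velocities under the applied torques.

    A deliberately simplified dynamic model: each wheel is a rigid body
    with inertia WHEEL_INERTIA plus viscous friction; the paper's model
    may include body-mass coupling and other terms.
    """
    friction = 0.01 * omega  # viscous friction torque (assumed)
    omega_dot = (torque - friction) / WHEEL_INERTIA
    return omega + omega_dot * DT


def wheel_speeds_to_twist(omega: np.ndarray) -> tuple[float, float]:
    """Standard differential-drive kinematics: wheel speeds -> (v, w)."""
    v_left, v_right = WHEEL_RADIUS * omega
    v = (v_right + v_left) / 2.0         # forward velocity (m/s)
    w = (v_right - v_left) / WHEEL_BASE  # yaw rate (rad/s)
    return v, w


# One control cycle: state -> torques -> wheel dynamics -> steering command.
state = np.zeros(10)  # placeholder 10-D observation
omega = np.zeros(2)   # wheel angular velocities
torque = policy(state)
omega = dynamics_step(omega, torque)
v, w = wheel_speeds_to_twist(omega)
print(f"command: v={v:.4f} m/s, w={w:.4f} rad/s")
```

Acting at the torque level rather than commanding velocities directly is what lets the learned planner respect actuator limits and inertia, i.e. the dynamic constraints the abstract emphasizes over purely kinematic formulations.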
