Learning to Drive using Inverse Reinforcement Learning and Deep Q-Networks

We propose an inverse reinforcement learning (IRL) approach that uses Deep Q-Networks to recover the reward function in problems with large state spaces. We evaluate this approach in a simulation-based autonomous driving scenario. Our results are consistent with the intuitive relation between the reward function and the readings of distance sensors mounted at different poses on the car. We also show that, after a few learning rounds, our simulated agent generates collision-free motions and performs human-like lane-change behaviour.
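The abstract does not state how the reward is parameterised or updated, so the following is only a minimal sketch under stated assumptions: the reward is taken to be linear in a vector of normalised distance-sensor readings, and the weights are adjusted with a simple feature-matching step between expert and agent trajectories. The function names, the learning rate, and the toy readings are all hypothetical, and the Deep Q-Network step that would compute the agent's policy for the current reward is omitted.

```python
from typing import List, Sequence


def discounted_feature_expectations(
    trajectory: Sequence[Sequence[float]], gamma: float = 0.9
) -> List[float]:
    """Return sum_t gamma^t * phi(s_t) for one trajectory of feature vectors."""
    mu = [0.0] * len(trajectory[0])
    for t, phi in enumerate(trajectory):
        for i, value in enumerate(phi):
            mu[i] += (gamma ** t) * value
    return mu


def linear_reward(weights: Sequence[float], phi: Sequence[float]) -> float:
    """Assumed reward model: r(s) = w . phi(s)."""
    return sum(w * x for w, x in zip(weights, phi))


def update_weights(
    weights: Sequence[float],
    mu_expert: Sequence[float],
    mu_agent: Sequence[float],
    lr: float = 0.1,
) -> List[float]:
    """Shift weights towards features the expert accumulates more than the agent."""
    return [w + lr * (e - a) for w, e, a in zip(weights, mu_expert, mu_agent)]


# Hypothetical readings: [left, front, right] clearance, normalised to [0, 1].
expert_traj = [[0.8, 1.0, 0.8], [0.8, 1.0, 0.8]]
agent_traj = [[0.8, 0.2, 0.8], [0.8, 0.1, 0.8]]

mu_expert = discounted_feature_expectations(expert_traj)
mu_agent = discounted_feature_expectations(agent_traj)
weights = update_weights([0.0, 0.0, 0.0], mu_expert, mu_agent)
```

In this toy case only the front-sensor weight becomes positive, because the expert keeps more front clearance than the agent; that is the kind of intuitive reward-versus-sensor relation the abstract refers to. In a full loop, the agent trajectories would be regenerated by retraining the Q-network on the updated reward before each weight update.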
