Learning driving styles for autonomous vehicles from demonstration

It is expected that autonomous vehicles capable of driving without human supervision will reach the market within the next decade. For user acceptance, such vehicles should not only be safe and reliable but also provide a comfortable user experience. However, individual perception of comfort may vary considerably among users: whereas some prefer sporty driving with high accelerations, others prefer a more relaxed style. A human driver's style is typically characterized by a large number of parameters, such as acceleration profiles, distances to other cars, and speed during lane changes. Tuning these parameters manually is tedious and error-prone. We therefore propose a learning-from-demonstration approach that allows the user to simply demonstrate the desired style by driving the car manually. We model the individual style in terms of a cost function and use feature-based inverse reinforcement learning to find the model parameters that best fit the observed style. Once the model has been learned, it can be used to efficiently compute trajectories for the vehicle in autonomous mode. We show that our approach is capable of learning cost functions and reproducing different driving styles using data from real drivers.
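To illustrate the feature-based inverse reinforcement learning idea described above, the following minimal sketch learns the weights of a cost function that is linear in trajectory features by matching the expected features of a softmin (maximum-entropy-style) trajectory distribution to the demonstrated features. This is a generic sketch, not the paper's implementation; all names (feature_fn, demo_features, candidate_trajectories) and the plain gradient update are illustrative assumptions.

import numpy as np

def learn_cost_weights(demo_features, candidate_trajectories, feature_fn,
                       learning_rate=0.1, iterations=100):
    # Sketch of feature-based IRL with a linear cost
    # c(trajectory) = w . f(trajectory). Names and update rule are
    # illustrative assumptions, not taken from the paper.
    w = np.zeros_like(demo_features)
    # Feature vectors of the trajectories the planner can generate,
    # e.g. mean acceleration, gaps to other cars, lane-change speed.
    feats = np.array([feature_fn(t) for t in candidate_trajectories])
    for _ in range(iterations):
        costs = feats @ w
        # Softmin distribution: low-cost trajectories get high probability.
        p = np.exp(-(costs - costs.min()))
        p /= p.sum()
        expected_features = p @ feats
        # Gradient step on the likelihood of the demonstrations: raising
        # the cost of features the model overuses relative to the demos
        # pulls the expected features toward the demonstrated ones.
        w += learning_rate * (expected_features - demo_features)
    return w

With the learned weights, the planner in autonomous mode can then select or optimize trajectories that minimize w . f(trajectory), reproducing the demonstrated style.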
