Socially aware motion planning with deep reinforcement learning

For robotic vehicles to navigate safely and efficiently in pedestrian-rich environments, it is important to model subtle human behaviors and navigation rules (e.g., passing on the right). However, while such socially compliant navigation is instinctive to humans, it remains difficult to quantify due to the stochasticity in people's behaviors. Existing works mostly rely on feature-matching techniques to describe and imitate human paths, but these often generalize poorly because the feature values can vary from person to person, and even from run to run. This work observes that while it is challenging to directly specify the details of what to do (the precise mechanisms of human navigation), it is straightforward to specify what not to do (violations of social norms). Specifically, using deep reinforcement learning, this work develops a time-efficient navigation policy that respects common social norms. The proposed method is shown to enable fully autonomous navigation of a robotic vehicle moving at human walking speed in an environment with many pedestrians.
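
Since the key idea is to encode what not to do (norm violations) in the learning objective rather than prescribing exact human-like paths, a minimal sketch of such a shaped reward may help make this concrete. The function below is illustrative only and is not the paper's published reward: it combines a goal reward, a collision penalty, a small per-step time cost, and a hypothetical penalty for passing an oncoming pedestrian on the left instead of the right. All weights, thresholds, and the function itself are assumed placeholders.

```python
import numpy as np

def navigation_reward(robot_pos, robot_vel, goal_pos, ped_pos, ped_vel,
                      goal_dist=0.2, collision_dist=0.3, norm_penalty=0.05):
    """Hypothetical shaped reward for socially aware navigation (2D numpy vectors)."""
    # Reaching the goal yields the only large positive reward.
    if np.linalg.norm(goal_pos - robot_pos) < goal_dist:
        return 1.0
    # Colliding with (or getting too close to) the pedestrian is heavily penalized.
    if np.linalg.norm(ped_pos - robot_pos) < collision_dist:
        return -0.25
    # Social-norm term: when the pedestrian is oncoming, a 2D cross product
    # tells which side of the robot's heading the pedestrian is on.
    rel_pos = ped_pos - robot_pos
    oncoming = np.dot(robot_vel, ped_vel) < 0.0
    side = robot_vel[0] * rel_pos[1] - robot_vel[1] * rel_pos[0]
    if oncoming and side < 0.0:
        # Pedestrian on the robot's right means the robot is keeping left,
        # violating the pass-on-the-right convention; apply a small penalty.
        return -norm_penalty
    # Otherwise, a small per-step cost encourages time-efficient paths.
    return -0.01
```

Because the norm penalty is much smaller than the collision penalty, a term of this kind would bias the learned policy toward right-handed passing without overriding safety or time efficiency; the details of "what to do" are left to the learned policy, while "what not to do" is specified explicitly.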
