Socially Adaptive Path Planning in Human Environments Using Inverse Reinforcement Learning

A key skill for mobile robots is the ability to navigate efficiently through their environment. For social or assistive robots, this involves navigating through human crowds. Typical performance criteria, such as reaching the goal via the shortest path, are not appropriate in such environments, where it is more important for the robot to move in a socially adaptive manner, for example by respecting pedestrians' comfort zones. We propose a framework for socially adaptive path planning in dynamic environments that generates human-like path trajectories. Our framework consists of three modules: a feature extraction module, an inverse reinforcement learning (IRL) module, and a path planning module. The feature extraction module extracts the features needed to characterize the state, such as the density and velocity of surrounding obstacles, from an RGB-D sensor. The IRL module uses a set of demonstration trajectories generated by an expert to learn the expert's behaviour under different state features, and represents it as a cost function that respects social variables. Finally, the planning module integrates a three-layer architecture: a global path is optimized according to a classical shortest-path objective using a global map known a priori; a local path is planned over a shorter horizon using the features extracted from the RGB-D sensor and the cost function inferred by the IRL module; and a low-level system handles avoidance of immediate obstacles. We evaluate our approach by deploying it on a real robotic wheelchair platform in various scenarios and comparing the robot's trajectories to human trajectories.
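As a rough illustration of the local planning step, the sketch below scores candidate waypoints with a linear cost over social features, in the spirit of IRL-based apprenticeship learning. The feature names, feature values, and weight vector here are placeholders for illustration only; in the framework described above the weights would be learned from expert demonstrations, not hand-set.

```python
import numpy as np

# Hypothetical social features per candidate waypoint:
# (crowd density, relative speed of nearby pedestrians, distance to goal).
# In the paper these would come from the RGB-D feature extraction module.

def social_cost(features, weights):
    """Linear cost c(s) = w . phi(s), the form typically assumed in IRL."""
    return float(np.dot(weights, features))

def choose_local_waypoint(candidates, weights):
    """Return the index of the candidate waypoint with the lowest cost."""
    costs = [social_cost(phi, weights) for phi in candidates]
    return int(np.argmin(costs))

# Placeholder weights standing in for the IRL-learned cost parameters.
weights = np.array([2.0, 0.5, 1.0])
candidates = [
    np.array([0.8, 0.2, 1.0]),  # crowded but short
    np.array([0.1, 0.1, 1.5]),  # sparse but longer
    np.array([0.5, 0.9, 1.2]),  # moderate density, fast-moving crowd
]
best = choose_local_waypoint(candidates, weights)  # index 1: sparse route
```

With a high weight on crowd density, the planner prefers the sparser, slightly longer route, which is the qualitative behaviour a socially adaptive cost is meant to produce.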
