Anticipatory Planning for Human-Robot Teams

When robots work alongside humans on collaborative tasks, they need to anticipate the humans' future actions and plan their own actions accordingly. The tasks we consider are performed in contextually rich environments containing objects, and there is large variation in how humans perform them. We use a graphical model to represent the state space, modeling humans through both their low-level kinematics and their high-level intent, and modeling their interactions with objects through physically grounded object affordances. This allows our model to anticipate a belief over possible future human actions, and we model the human's and robot's behavior as an MDP in this rich state space. We further observe that, due to perception errors and the limitations of the model, the human may not take the optimal action; we therefore present the robot's anticipatory planning under different human behaviors within the model's scope. In experiments on the Cornell Activity Dataset, we show that our method outperforms various baselines for collaborative planning.
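The core idea, planning the robot's action against a belief over the human's next action, can be sketched as a one-step expected-value computation over an MDP. The state space, transition function, reward, and belief below are illustrative assumptions for the sketch, not the paper's actual model:

```python
# Minimal sketch of anticipatory planning: the robot picks the action that
# maximizes expected value, marginalizing over an anticipated distribution
# (belief) of human actions. All model components here are toy assumptions.

GAMMA = 0.9  # discount factor

def plan_robot_action(state, robot_actions, human_belief, transition, reward, value):
    """Return the robot action with the highest expected Q-value, where the
    expectation averages over the belief about the human's next action."""
    best_action, best_q = None, float("-inf")
    for a_r in robot_actions:
        q = 0.0
        for a_h, p_h in human_belief(state).items():  # belief over human actions
            s_next = transition(state, a_r, a_h)
            q += p_h * (reward(state, a_r, a_h) + GAMMA * value[s_next])
        if q > best_q:
            best_action, best_q = a_r, q
    return best_action
```

For example, with a belief that the human will most likely reach for an object, this computation favors a robot action (such as handing the object over) whose value is high under that anticipated human action.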
