Modeling and Solving Human-Robot Collaborative Tasks Using POMDPs

Robots have long helped automate repetitive tasks, but their role as co-workers or assistants to humans remains limited by their sensing and perception abilities, which in turn restrict their capacity for inference. To assist with a task, a robot must learn to infer its human user's state in the process, which is hidden information; this inference is multimodal and ranges over a large set of observations. Previous dialogue-modeling work [22] has framed collaborative tasks, but it does not consider physical state-action spaces. In this work we first model a joint task between a human and a robot as a POMDP [7] with turn taking and common goals. We then present a POMDP solver capable of handling large state spaces and an infinite number of observations to perform multimodal inference, and we compare it with other state-of-the-art POMDP solvers. Finally, we apply this solver to the collaborative task of changing a child's diaper on a robotic platform.
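At the core of this kind of inference over a hidden human state is the standard POMDP belief update. A minimal sketch in Python follows; the two-state human model, action, and observation names are toy placeholders for illustration, not the paper's actual task model or solver.

```python
# Discrete POMDP belief update (Bayes filter):
#   b'(s') ∝ O(o | s', a) * sum_s T(s' | s, a) * b(s)
# The model below is an illustrative assumption, not the paper's model.

def belief_update(belief, states, T, O, action, obs):
    """Update a belief over hidden states after taking `action`
    and receiving observation `obs`.

    belief: dict state -> probability
    T: dict (s, action, s') -> transition probability
    O: dict (action, s', obs) -> observation probability
    """
    unnorm = {}
    for sp in states:
        predicted = sum(T[(s, action, sp)] * belief[s] for s in states)
        unnorm[sp] = O[(action, sp, obs)] * predicted
    z = sum(unnorm.values())
    if z == 0.0:
        raise ValueError("observation has zero probability under this belief")
    return {sp: p / z for sp, p in unnorm.items()}

# Toy model: is the user still engaged in the task, or finished?
states = ["needs_help", "done"]
belief = {"needs_help": 0.5, "done": 0.5}
# Under "wait", the hidden state does not change.
T = {(s, "wait", sp): 1.0 if s == sp else 0.0 for s in states for sp in states}
# A speech request is far more likely when the user still needs help.
O = {("wait", "needs_help", "speech_request"): 0.8,
     ("wait", "done", "speech_request"): 0.1}

posterior = belief_update(belief, states, T, O, "wait", "speech_request")
# Hearing a speech request shifts the belief toward "needs_help".
```

In practice the paper's solver must cope with observation spaces too large to enumerate this way, but the update above is the quantity any such solver approximates.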

[1]  Sebastian Thrun et al. Stanley: The robot that won the DARPA Grand Challenge, 2006.

[2]  David Silver et al. Combining online and offline knowledge in UCT, 2007, ICML '07.

[3]  Luke S. Zettlemoyer et al. Learning to Parse Natural Language Commands to a Robot Control System, 2012, ISER.

[4]  Csaba Szepesvári et al. Bandit Based Monte-Carlo Planning, 2006, ECML.

[5]  Nicholas Roy et al. Spoken language interaction with model uncertainty: an adaptive human–robot interaction system, 2008, Connect. Sci.

[6]  N. Nilsson. Review of: Stuart Russell and Peter Norvig, Artificial Intelligence: A Modern Approach, 1996.

[7]  Steve J. Young et al. Natural actor and belief critic: Reinforcement algorithm for learning parameters of dialogue systems modelled as POMDPs, 2011, TSLP.

[8]  Joelle Pineau et al. Anytime Point-Based Approximations for Large POMDPs, 2006, J. Artif. Intell. Res.

[9]  Edward J. Sondik et al. The Optimal Control of Partially Observable Markov Processes over a Finite Horizon, 1973, Oper. Res.

[10]  Joelle Pineau et al. Point-based value iteration: An anytime algorithm for POMDPs, 2003, IJCAI.

[11]  Matthew R. Walter et al. Understanding Natural Language Commands for Robotic Navigation and Mobile Manipulation, 2011, AAAI.

[12]  Wolfram Burgard et al. Experiences with an Interactive Museum Tour-Guide Robot, 1999, Artif. Intell.

[13]  Peter Norvig et al. Artificial Intelligence: A Modern Approach, 1995.

[14]  Stefan Schaal et al. Natural Actor-Critic, 2003, Neurocomputing.

[15]  Joelle Pineau et al. Experiences with a mobile robotic guide for the elderly, 2002, AAAI/IAAI.

[16]  Crystal Chao. Timing multimodal turn-taking for human-robot cooperation, 2012, ICMI '12.

[17]  Yishay Mansour et al. A Sparse Sampling Algorithm for Near-Optimal Planning in Large Markov Decision Processes, 1999, Machine Learning.

[18]  Leslie Pack Kaelbling et al. Planning and Acting in Partially Observable Stochastic Domains, 1998, Artif. Intell.

[19]  Joel Veness et al. Monte-Carlo Planning in Large POMDPs, 2010, NIPS.