Shared Autonomy via Hindsight Optimization

In shared autonomy, user input and robot autonomy are combined to control a robot to achieve a goal. Often, the robot does not know a priori which goal the user wants to achieve, and must both predict the user's intended goal and assist in achieving it. We formulate shared autonomy as a Partially Observable Markov Decision Process (POMDP) with uncertainty over the user's goal. We use maximum entropy inverse optimal control to estimate a distribution over the user's goal from the history of user inputs. Ideally, the robot assists the user by selecting the action that minimizes the expected cost-to-go for the (unknown) goal. As solving the POMDP for the optimal action is intractable, we use hindsight optimization to approximate the solution. In a user study, we compare our method to a standard predict-then-blend approach. We find that our method enables users to accomplish tasks more quickly while providing less input. However, when asked to rate each system, users gave mixed assessments, citing a tradeoff between maintaining control authority and accomplishing tasks quickly.
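To make the approach concrete, the sketch below illustrates the two pieces described above: a MaxEnt-IOC-style posterior over candidate goals given the history of user inputs, and hindsight (QMDP-style) optimization that selects the assistance action minimizing the expected cost-to-go under that posterior. This is a minimal illustration under strong simplifying assumptions (point goals in the plane, cost-to-go equal to remaining Euclidean distance, a small discrete action set); all function names are hypothetical and do not reflect the authors' implementation.

```python
import numpy as np

def goal_posterior(user_inputs, positions, goals, beta=1.0):
    """MaxEnt-IOC-style posterior: goals toward which the observed user
    inputs point receive exponentially higher probability (illustrative)."""
    log_likelihoods = np.zeros(len(goals))
    for g_idx, goal in enumerate(goals):
        for pos, u in zip(positions, user_inputs):
            desired = goal - pos
            norm = np.linalg.norm(desired)
            if norm > 1e-8:
                desired = desired / norm
            # Reward alignment of the user's input with the direction to this goal.
            log_likelihoods[g_idx] += beta * desired.dot(u)
    posterior = np.exp(log_likelihoods - log_likelihoods.max())
    return posterior / posterior.sum()

def cost_to_go(pos, goal):
    """Cost-to-go for a known goal (here: remaining Euclidean distance)."""
    return np.linalg.norm(goal - pos)

def hindsight_action(pos, goals, posterior, candidate_actions, step=0.05):
    """Hindsight optimization: pick the action minimizing the expected
    cost-to-go, where the expectation is over the goal posterior and each
    goal is treated as if it were fully observed."""
    best_a, best_cost = None, np.inf
    for a in candidate_actions:
        next_pos = pos + step * a
        expected = sum(p * cost_to_go(next_pos, g)
                       for p, g in zip(posterior, goals))
        if expected < best_cost:
            best_a, best_cost = a, expected
    return best_a

if __name__ == "__main__":
    goals = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
    positions = [np.array([0.0, 0.0]), np.array([0.05, 0.01])]
    user_inputs = [np.array([1.0, 0.1]), np.array([1.0, 0.0])]  # mostly toward goal 0
    post = goal_posterior(user_inputs, positions, goals)
    actions = [np.array([1.0, 0.0]), np.array([0.0, 1.0]),
               np.array([0.7, 0.7]), np.array([-1.0, 0.0])]
    a = hindsight_action(positions[-1], goals, post, actions)
    print("goal posterior:", post, "assist action:", a)
```

Because the expectation is taken over goals rather than over a single most-likely goal, the selected action hedges across hypotheses when the posterior is uncertain, in contrast to a predict-then-blend approach that commits to one predicted goal before assisting.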
