Modeling Supervisor Safe Sets for Improving Collaboration in Human-Robot Teams

When a human supervisor collaborates with a team of robots, the human's attention is divided and cognitive resources are at a premium. We aim to optimize the distribution of these resources and the flow of supervisor attention. To this end, we propose an idealized supervisor model of human behavior. Such a supervisor employs a potentially inaccurate internal model of the robots' dynamics to judge safety. We represent these safety judgements by constructing a safe set from this internal model using reachability theory. When a robot leaves this safe set, the idealized supervisor intervenes to assist, regardless of whether the robot is objectively safe. False positives, in which the supervisor incorrectly judges a robot to be in danger, needlessly consume supervisor attention. In this work, we propose a method that decreases false positives by learning the supervisor's safe set and using that information to govern robot behavior. We prove that robots behaving according to our approach reduce the occurrence of false positives under our idealized supervisor model. Furthermore, we validate our approach empirically with a user study that demonstrates a significant (p = 0.0328) reduction in false positives for our method compared to a baseline safety controller.
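
As a minimal illustrative sketch of this model (the notation below is ours, not drawn from the paper), the idealized supervisor's judgement can be written as a set-membership test. Let $f$ denote the robots' true dynamics and $\hat{f}$ the supervisor's internal model of those dynamics; applying reachability analysis (e.g., Hamilton-Jacobi reachability) to each yields safety value functions $V$ and $\hat{V}$ whose zero superlevel sets define the true safe set $\Omega$ and the supervisor's safe set $\hat{\Omega}$:

$$\Omega = \{\, x : V(x) \ge 0 \,\}, \qquad \hat{\Omega} = \{\, x : \hat{V}(x) \ge 0 \,\}.$$

The idealized supervisor intervenes whenever a robot's state $x$ leaves $\hat{\Omega}$, so a false positive is a state satisfying

$$x \notin \hat{\Omega} \quad \text{and} \quad x \in \Omega,$$

i.e., the supervisor judges the robot to be in danger even though it is objectively safe. Under this sketch, governing the robot so that it remains within (an estimate of) $\hat{\Omega} \cap \Omega$ avoids triggering such interventions, which is the intuition behind learning the supervisor's safe set and using it to shape robot behavior.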
