Haptic Assistance via Inverse Reinforcement Learning

In assistive teleoperation, an autonomous agent predicts a human user's intent and uses that prediction to align the behavior of a controlled system with the human's goal, even when the human's own inputs are not perfectly aligned with that goal. Haptic assistance achieves this effect by influencing the human through forces and torques applied to their control interface. In this work, we describe a method for creating such haptic assistance via inverse reinforcement learning applied to successful task demonstrations. We then use this assistance method to examine the role that haptic feedback plays in assistive teleoperation. In our user study, we find that when the assistance incorrectly predicts a user's intent, aiding the user via haptic feedback on the control interface, rather than directly modifying their input signal, is preferable and provides the user with a significantly greater sense of control over the system.
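The abstract does not include code, but the pipeline it describes (learn a reward from successful demonstrations with inverse reinforcement learning, then render assistance as forces on the control interface) can be illustrated with a toy sketch. The example below uses maximum-entropy IRL on a hypothetical one-dimensional reaching task and an illustrative force-rendering rule; the environment, demonstrations, gain, and function names are assumptions for illustration, not details taken from the study.

```python
# A minimal sketch (not the paper's implementation): maximum-entropy IRL
# (Ziebart et al., 2008) learns a reward from successful demonstrations on a
# toy 1-D reaching task, and a simple haptic-force rule is derived from the
# resulting policy. Environment, demos, and gains are illustrative assumptions.
import numpy as np

N_STATES, N_ACTIONS = 10, 2        # states 0..9; actions: 0 = left, 1 = right
GOAL = 9                           # every successful demonstration ends here

def step(s, a):
    """Deterministic toy dynamics: move one cell left or right, clipped to the grid."""
    return int(np.clip(s + (1 if a == 1 else -1), 0, N_STATES - 1))

def features(s):
    """One-hot state features, so the linear reward is a per-state table."""
    f = np.zeros(N_STATES)
    f[s] = 1.0
    return f

def soft_value_iteration(reward, n_iters=100):
    """Backward pass: soft values and the stochastic MaxEnt policy P(a | s)."""
    V = np.full(N_STATES, -1e9)
    V[GOAL] = 0.0                                  # absorbing goal state
    for _ in range(n_iters):
        Q = np.array([[reward[s] + V[step(s, a)] for a in range(N_ACTIONS)]
                      for s in range(N_STATES)])
        V = np.logaddexp.reduce(Q, axis=1)
        V[GOAL] = 0.0
    return np.exp(Q - np.logaddexp.reduce(Q, axis=1)[:, None])   # row-normalized

def visitation_counts(policy, start, horizon=30):
    """Forward pass: expected state-visitation counts under the MaxEnt policy."""
    D, d = np.zeros(N_STATES), np.zeros(N_STATES)
    d[start] = 1.0
    for _ in range(horizon):
        D += d
        d_next = np.zeros(N_STATES)
        for s in range(N_STATES):
            if s == GOAL:                          # goal is absorbing
                continue
            for a in range(N_ACTIONS):
                d_next[step(s, a)] += d[s] * policy[s, a]
        d = d_next
    return D            # with one-hot features, visitation counts = feature counts

# "Successful task demonstrations": state trajectories that reach the goal.
demos = [[0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
         [2, 3, 4, 5, 6, 7, 8, 9]]
empirical = sum(features(s) for traj in demos for s in traj) / len(demos)

# MaxEnt IRL: gradient ascent on reward weights to match feature expectations.
w = np.zeros(N_STATES)
for _ in range(200):
    policy = soft_value_iteration(w)
    expected = sum(visitation_counts(policy, traj[0]) for traj in demos) / len(demos)
    w += 0.05 * (empirical - expected)

def haptic_force(s, gain=1.0):
    """Assistive force on the interface: push toward the action the learned policy
    prefers, scaled by how strongly it prefers it (an illustrative rendering rule)."""
    policy = soft_value_iteration(w)
    return gain * (policy[s, 1] - policy[s, 0])    # positive pushes toward the goal

print(haptic_force(4))   # expected positive: the assistance nudges the user toward the goal
```

The key design point this sketch mirrors is that the learned policy is rendered as a force on the user's interface rather than used to overwrite the user's command, which is the distinction between haptic assistance and direct input modification discussed in the abstract.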
