Hierarchical robot learning for physical collaboration between humans and robots

Human-in-the-loop robot learning is an important capability for robots in human-robot collaboration (HRC) tasks. Research on interactive learning has mainly focused on robot learning through human cognitive interaction; robot learning through human physical interaction remains challenging due to the stochastic nature of human control. In this paper, we present a hierarchical robot learning approach with two learning levels for HRC tasks. High-level motion learning learns the motion policy for the object, which serves as the shared plan between the robot and the human. In low-level interactive learning, the human action is first predicted by an Extended Kalman Filter (EKF), and Q-learning with function approximation is then applied to select the optimal robot action under the guidance of the predicted human action. Finally, the proposed learning approach is validated on a UR5 robot. Our experimental results show that the presented approach enables the robot to coordinate adaptively with a human and contribute actively to HRC tasks.
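To make the low-level prediction step concrete, the sketch below shows one plausible form of the EKF prediction of human motion. The abstract does not publish the motion or measurement models, so this is a minimal sketch under assumptions: a constant-velocity model of the human hand in the plane with position-only measurements, in which case the EKF Jacobians coincide with the linear transition and observation matrices. The class name `HumanMotionEKF` and all noise parameters are hypothetical.

```python
import numpy as np

class HumanMotionEKF:
    """Predicts human hand motion one control step ahead.

    Hypothetical sketch: assumes a constant-velocity model with state
    [x, y, vx, vy] and position-only measurements. With this linear
    model the EKF Jacobians equal the transition/observation matrices.
    """

    def __init__(self, dt=0.01, q=1e-3, r=1e-2):
        self.x = np.zeros(4)                  # state estimate [x, y, vx, vy]
        self.P = np.eye(4)                    # state covariance
        self.F = np.eye(4)                    # transition Jacobian
        self.F[0, 2] = self.F[1, 3] = dt
        self.H = np.zeros((2, 4))             # observation Jacobian (position only)
        self.H[0, 0] = self.H[1, 1] = 1.0
        self.Q = q * np.eye(4)                # process noise covariance
        self.R = r * np.eye(2)                # measurement noise covariance

    def predict(self):
        """Propagate the state one step ahead; returns the predicted position."""
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        return self.x[:2]

    def update(self, z):
        """Correct the estimate with a measured hand position z = [x, y]."""
        y = z - self.H @ self.x                     # innovation
        S = self.H @ self.P @ self.H.T + self.R     # innovation covariance
        K = self.P @ self.H.T @ np.linalg.inv(S)    # Kalman gain
        self.x = self.x + K @ y
        self.P = (np.eye(4) - K @ self.H) @ self.P
```

In use, each control cycle would call `update` with the sensed human hand position and `predict` to obtain the one-step-ahead position that guides the robot's action selection.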

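The abstract also names Q-learning with function approximation as the action-selection mechanism. The features, action set, and reward are not specified, so the following is a minimal sketch assuming linear function approximation, a feature function `phi_fn(state, action)`, and a small discrete set of robot actions; all names and hyperparameters are illustrative.

```python
import numpy as np

class LinearQAgent:
    """Q-learning with linear function approximation, Q(s, a) = w . phi(s, a).

    Hypothetical sketch: the paper does not publish its features or action
    set, so we assume a user-supplied feature function and a discrete set
    of candidate robot actions (e.g., force commands along the shared plan).
    """

    def __init__(self, n_features, actions, alpha=0.1, gamma=0.95, eps=0.1):
        self.w = np.zeros(n_features)    # weight vector defining the Q-function
        self.actions = actions           # discrete candidate robot actions
        self.alpha, self.gamma, self.eps = alpha, gamma, eps

    def q(self, phi):
        return self.w @ phi

    def act(self, state, phi_fn, rng=np.random):
        """Epsilon-greedy selection over the discrete action set."""
        if rng.random() < self.eps:
            return rng.choice(len(self.actions))
        qs = [self.q(phi_fn(state, a)) for a in self.actions]
        return int(np.argmax(qs))

    def update(self, state, a_idx, reward, next_state, phi_fn):
        """One-step semi-gradient Q-learning update on the weights."""
        phi = phi_fn(state, self.actions[a_idx])
        q_next = max(self.q(phi_fn(next_state, a)) for a in self.actions)
        td_error = reward + self.gamma * q_next - self.q(phi)
        self.w += self.alpha * td_error * phi
```

In the paper's setting, the state would presumably include the EKF's predicted human action, and the reward would favor robot actions consistent with the shared motion plan; both are assumptions on our part, not details given in the abstract.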