Robot behavior adaptation for human-robot interaction based on policy gradient reinforcement learning

In this paper, we propose a behavior adaptation mechanism that makes human-robot interaction run more smoothly. The mechanism is based on policy gradient reinforcement learning: it reads minute body signals from a human partner and uses them to adjust interaction distance, gaze meeting, and motion speed and timing. An experiment with twelve subjects shows that the mechanism adapts autonomously to individual preferences.
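
To make the idea of adjusting behavior parameters from a reward signal concrete, the following is a minimal sketch of one finite-difference style policy gradient update. It is an illustration under stated assumptions, not the method as implemented in the paper: the function names `pgrl_step` and `adapt`, the perturbation sizes, and the reward function are all hypothetical. In a real system the reward would have to be derived from the partner's observed body signals.

```python
import random


def pgrl_step(theta, reward_fn, eps, step_size, n_policies=15, rng=random):
    """One finite-difference policy gradient update of behavior parameters.

    theta     -- current parameter values (e.g. distance, gaze ratio, speed)
    reward_fn -- hypothetical callable that scores one interaction run with
                 the given parameters (higher = partner seemed more at ease)
    eps       -- per-parameter perturbation sizes
    step_size -- scale of the update, in units of eps
    """
    # Try several randomly perturbed parameter sets and record their rewards.
    trials = []
    for _ in range(n_policies):
        signs = [rng.choice((-1, 0, 1)) for _ in theta]
        candidate = [t + s * e for t, s, e in zip(theta, signs, eps)]
        trials.append((signs, reward_fn(candidate)))

    # Per parameter, compare the mean reward of the +eps, 0 and -eps groups.
    grad = []
    for i in range(len(theta)):
        groups = {-1: [], 0: [], 1: []}
        for signs, reward in trials:
            groups[signs[i]].append(reward)
        if not groups[1] or not groups[-1]:
            grad.append(0.0)  # not enough evidence for this parameter
            continue
        plus = sum(groups[1]) / len(groups[1])
        minus = sum(groups[-1]) / len(groups[-1])
        if groups[0]:
            zero = sum(groups[0]) / len(groups[0])
            if zero > plus and zero > minus:
                grad.append(0.0)  # leaving the parameter unchanged looked best
                continue
        grad.append(plus - minus)

    norm = sum(g * g for g in grad) ** 0.5
    if norm == 0.0:
        return list(theta)
    # Move along the normalized gradient estimate, scaled per parameter.
    return [t + step_size * e * g / norm for t, e, g in zip(theta, eps, grad)]


def adapt(theta, reward_fn, eps, step_size, n_steps, rng=random):
    """Repeat the update for a fixed number of steps (assumed stopping rule)."""
    for _ in range(n_steps):
        theta = pgrl_step(theta, reward_fn, eps, step_size, rng=rng)
    return theta
```

The sketch searches directly in parameter space, so it needs no model of the human partner, only a scalar score per interaction. The per-parameter scaling by `eps` is a design choice of this sketch that lets quantities with different units (a distance and a speed, say) share one step size.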
