Paper information - Henrik Christensen Behavior Adaptation for a Socially Interactive Robot

This report addresses the problem of making a humanoid robot learn a human partner’s preferences regarding personal space and adapt to them in real time. An adaptive system using policy gradient reinforcement learning (PGRL) is proposed, implemented, and evaluated in an experiment with human subjects. The experiment shows that this is a viable solution to the problem, but that some issues remain to be resolved.

Swedish title: Beteendeanpassning för en socialt interaktiv robot (Behavior Adaptation for a Socially Interactive Robot)

Henrik Christensen | Christian Smith
