Paper information - Unsupervised Self-Development in a Multi-Reward Environment

Unsupervised Self-Development in a Multi-Reward Environment

Self-development is an important quality for artificial agents, allowing skill development or improvement. In this contribution we analyze this problem for a scenario with multiple rewards, some easier to reach than others. There is no provided sequence of tasks to enforce self-development; rather, the agent must have an intrinsic motivation to discover more difficult reward sources even if a trivial one is always at hand. Then, by removing simple reward sources, the development performance can be measured. We describe the scenario, then discuss and measure the applicability of standard learning methods. Based on this analysis we present two techniques to allow the desired self-development: a learning rule for quick trajectory learning and multimodel learning for multiple reward sources. Simulations show the validity of the presented methods.

Christian Goerick | Benjamin Dittes

[1] Oliver Brock, et al. A Framework for Learning and Control in Intelligent Humanoid Robots, 2005, Int. J. Humanoid Robotics.

[2] François Michaud, et al. Reactive Planning in a Motivated Behavioral Architecture, 2005, AAAI.

[3] Doina Precup, et al. Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning, 1999, Artif. Intell.

[4] Sonia Chernova, et al. From Deliberative to Routine Behaviors: A Cognitively Inspired Action-Selection Mechanism for Routine Behavior Capture, 2007, Adapt. Behav.

[5] Giulio Sandini, et al. Developmental robotics: a survey, 2003, Connect. Sci.

[6] Mahesan Niranjan, et al. On-line Q-learning using connectionist systems, 1994.

[7] Andrew W. Moore, et al. Reinforcement Learning: A Survey, 1996, J. Artif. Intell. Res.

[8] Stefan Schaal, et al. Incremental Online Learning in High Dimensions, 2005, Neural Computation.

[9] Jürgen Schmidhuber, et al. Optimal Artificial Curiosity, Creativity, Music, and the Fine Arts, 2005.

[10] Pierre-Yves Oudeyer, et al. Intrinsic Motivation Systems for Autonomous Mental Development, 2007, IEEE Transactions on Evolutionary Computation.

[11] A. Lazaric, et al. Self-Development Framework for Reinforcement Learning Agents, 2006.

[12] W. Seelen, et al. Usage of General Developmental Principles for Adaptation of Reactive Behavior, 2006.

[13] Andrew G. Barto, et al. An Adaptive Robot Motivational System, 2006, SAB.