Paper information

[Figure: agent-environment interaction. Panel A: Agent, Environment, Critic; signals: States, Actions, Rewards. Panel B: Agent, Internal Environment, Critic, External Environment; signals: Sensations, States, Decisions, Actions, Rewards; label: "Organism".]

Psychologists call behavior intrinsically motivated when it is engaged in for its own sake rather than as a step toward solving a specific problem of clear practical value. But what we learn during intrinsically motivated behavior is essential for our development as competent autonomous entities able to efficiently solve a wide range of practical problems as they arise. In this paper we present initial results from a computational study of intrinsically motivated reinforcement learning aimed at allowing artificial agents to construct and extend hierarchies of reusable skills that are needed for competent autonomy.

Satinder Singh
