From Exploration to Planning

Learning and behaviour of mobile robots face limitations. In reinforcement learning, for example, an agent learns a strategy for reaching only one specific target point within a state space. Humans, by contrast, can grasp a visually localized object at any point in space or navigate to any position in a room. We present a neural network model in which an agent learns a model of the state space that allows it to reach an arbitrarily chosen goal via a short route. By randomly exploring the state space, the agent learns associations between two adjoining states and the action that links them. Given arbitrary starting and goal positions, route-finding proceeds in two steps. First, an activation gradient spreads outward from the goal position along the associative connections. Second, the agent uses the state-action associations to determine the actions needed to ascend the gradient toward the goal. All mechanisms are biologically justifiable.
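The following is a minimal sketch of the two-step route-finding scheme described above, not the paper's implementation: random exploration stores state-state-action associations, an activation gradient then spreads from the goal, and the agent ascends it using the stored associations. The grid size, action set, and decay factor are illustrative assumptions.

```python
import random

# Assumed toy grid world and action set (not taken from the paper).
ACTIONS = {"up": (0, -1), "down": (0, 1), "left": (-1, 0), "right": (1, 0)}
SIZE = 10  # assumed 10x10 state space

def explore(steps=5000):
    """Random exploration: learn which action links two adjoining states."""
    links = {}  # (state, next_state) -> action
    s = (random.randrange(SIZE), random.randrange(SIZE))
    for _ in range(steps):
        a = random.choice(list(ACTIONS))
        dx, dy = ACTIONS[a]
        nxt = (min(max(s[0] + dx, 0), SIZE - 1),
               min(max(s[1] + dy, 0), SIZE - 1))
        if nxt != s:
            links[(s, nxt)] = a
        s = nxt
    return links

def spread_activation(goal, links, decay=0.9):
    """Step 1: propagate an activation gradient outward from the goal."""
    activation = {goal: 1.0}
    frontier = [goal]
    while frontier:
        next_frontier = []
        for state in frontier:
            for (s, n), _ in links.items():
                # activation flows backwards along learned associations
                if n == state and s not in activation:
                    activation[s] = activation[state] * decay
                    next_frontier.append(s)
        frontier = next_frontier
    return activation

def plan(start, goal, links):
    """Step 2: greedily ascend the gradient via state-action associations."""
    activation = spread_activation(goal, links)
    path, s = [], start
    while s != goal and len(path) < SIZE * SIZE:
        candidates = [(activation.get(n, 0.0), a, n)
                      for (st, n), a in links.items() if st == s]
        if not candidates:
            break  # state was never visited during exploration
        _, a, s = max(candidates)
        path.append(a)
    return path

links = explore()
print(plan((0, 0), (7, 5), links))
```

Under these assumptions the activation value decays with distance from the goal, so following its gradient yields a short route along transitions the agent has actually experienced.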
