Modeling and Simulation of Elementary Robot Behaviors using Associative Memories

Today, several drawbacks impede the much-needed use of robot learning techniques in real applications. First, the time needed to synthesize any behavior is prohibitive. Second, the robot's behavior during the learning phase is, by definition, poor; it may even be dangerous. Third, except within the lazy learning approach, a new behavior implies a new learning phase. We propose in this paper to use associative memories (self-organizing maps) to encode the non-explicit model of the robot-world interaction sampled by the lazy memory, and then to generate a robot behavior by means of situations to be achieved, i.e., points on the self-organizing maps. Any behavior can be synthesized instantaneously by defining a goal situation. Its performance will be minimal (not necessarily bad) and will improve through mere repetition of the behavior.
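The idea sketched in the abstract can be illustrated with a minimal toy example: a self-organizing map is trained on sampled (situation, action) vectors, and a behavior is then generated by querying the map with a goal situation and reading out the associated action. This is a hedged sketch, not the paper's implementation; the sensor/motor dimensions, grid size, and training schedule below are all hypothetical choices.

```python
# Sketch (assumptions, not the paper's code): situations are 4 hypothetical
# sensor readings, actions are 2 hypothetical motor commands; each training
# sample concatenates both into one 6-D vector stored on a 10x10 SOM.
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical robot-world interaction samples collected by the lazy memory.
samples = rng.random((200, 6))

grid = 10
weights = rng.random((grid, grid, 6))
coords = np.stack(np.meshgrid(np.arange(grid), np.arange(grid),
                              indexing="ij"), axis=-1)

def train(weights, samples, epochs=20, lr0=0.5, sigma0=3.0):
    """Standard SOM training: move each best-matching unit and its
    neighborhood toward the presented sample."""
    for t in range(epochs):
        lr = lr0 * (1 - t / epochs)
        sigma = sigma0 * (1 - t / epochs) + 0.5
        for x in samples:
            d = np.linalg.norm(weights - x, axis=-1)
            bmu = np.unravel_index(np.argmin(d), d.shape)
            g = np.exp(-np.sum((coords - np.array(bmu)) ** 2, axis=-1)
                       / (2 * sigma ** 2))
            weights += lr * g[..., None] * (x - weights)
    return weights

weights = train(weights, samples)

def act(weights, goal_situation):
    """Behavior generation: match the goal against the situation part of
    each unit and return the action stored alongside the best match."""
    d = np.linalg.norm(weights[..., :4] - goal_situation, axis=-1)
    bmu = np.unravel_index(np.argmin(d), d.shape)
    return weights[bmu][4:]

action = act(weights, rng.random(4))
print(action.shape)
```

Because a new behavior is just a new goal point on the already-trained map, no further learning phase is needed to obtain it, which is the property the abstract emphasizes.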
