Several drawbacks currently impede the much-needed use of robot learning techniques in real applications. First, the time needed to synthesize any behavior is prohibitive. Second, the robot's behavior during the learning phase is by definition bad, and may even be dangerous. Third, except within the lazy learning approach, each new behavior requires a new learning phase. In this paper, we propose to use self-organizing maps to encode the non-explicit model of the robot-world interaction sampled by the lazy memory, and then to generate a robot behavior by specifying situations to be achieved, i.e., points on the self-organizing maps. Any behavior can be synthesized instantaneously by defining a goal situation. Its initial performance will be minimal (though not obviously bad) and will improve through mere repetition of the behavior.
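As a rough illustration of the idea, the sketch below shows one possible reading of it, not the method of the paper itself. It assumes that the lazy memory stores (situation, action, next situation) samples, that a small self-organizing map is fitted to the sampled situations, and that a goal is given as a unit of that map. The function names (`train_som`, `best_matching_unit`, `choose_action`), the one-dimensional toy world, and all parameter values are hypothetical and chosen only for the example.

```python
import numpy as np

def best_matching_unit(weights, x):
    """Index (row, col) of the map unit whose prototype is closest to x."""
    dist = np.linalg.norm(weights - x, axis=-1)
    return tuple(int(i) for i in np.unravel_index(np.argmin(dist), dist.shape))

def train_som(samples, rows, cols, epochs=30, lr0=0.5, sigma0=2.0, seed=0):
    """Fit a rectangular SOM to samples of shape (n, d); returns (rows, cols, d) weights."""
    rng = np.random.default_rng(seed)
    low, high = samples.min(axis=0), samples.max(axis=0)
    weights = rng.uniform(low, high, size=(rows, cols, samples.shape[1]))
    grid = np.stack(np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij"), axis=-1)
    total, step = epochs * len(samples), 0
    for _ in range(epochs):
        for x in samples[rng.permutation(len(samples))]:
            frac = step / total
            lr = lr0 * (1.0 - frac)                   # learning rate decays linearly
            sigma = max(sigma0 * (1.0 - frac), 0.5)   # neighborhood radius shrinks
            bmu = np.array(best_matching_unit(weights, x))
            grid_d2 = ((grid - bmu) ** 2).sum(axis=-1)
            h = np.exp(-grid_d2 / (2.0 * sigma ** 2))  # Gaussian neighborhood on the map
            weights += lr * h[..., None] * (x - weights)
            step += 1
    return weights

def choose_action(weights, memory, current, goal_unit):
    """Pick the remembered action whose recorded outcome is closest to the goal.

    Assumption of this sketch: the goal is a map unit, and its prototype
    vector is taken as the goal situation.
    """
    goal = weights[goal_unit]
    cell = best_matching_unit(weights, current)
    # Lazy lookup: only samples recorded in the same map cell as the current situation.
    candidates = [m for m in memory if best_matching_unit(weights, m[0]) == cell]
    if not candidates:  # nothing recorded in this cell: fall back to the whole memory
        candidates = memory
    _, action, _ = min(candidates, key=lambda m: np.linalg.norm(m[2] - goal))
    return action

# Hypothetical toy world: the situation is a position in [0, 1],
# and an action of -1 or +1 moves the robot by 0.1 to the left or right.
positions = np.linspace(0.0, 1.0, 21)
memory = [(np.array([p]), a, np.array([np.clip(p + 0.1 * a, 0.0, 1.0)]))
          for p in positions for a in (-1, +1)]
situations = np.array([m[0] for m in memory])   # shape (42, 1)
som = train_som(situations, rows=1, cols=10)

goal_unit = best_matching_unit(som, np.array([0.9]))  # goal situation: "be near 0.9"
print(choose_action(som, memory, np.array([0.2]), goal_unit))
```

In this sketch no learning phase is tied to a particular goal: changing `goal_unit` immediately yields a different behavior from the same memory. Appending the samples observed while the behavior runs would give later calls to `choose_action` more candidates to draw on, which is one way to picture improvement through repetition.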