A novel learning control architecture is applied to navigation. A sophisticated test-bed simulates a cylindrical robot with a sonar belt in a planar environment. The task is short-range homing in the presence of obstacles. The robot receives no global information and assumes no comprehensive world model; it receives only sensory information, which is inherently limited. A connectionist architecture is presented that incorporates a large amount of a priori knowledge in the form of hard-wired networks, architectural constraints, and initial weights. Instead of hard-wiring static potential fields from object models, my architecture learns sensor-based potential fields, automatically adjusting them to avoid local minima and to produce efficient homing trajectories. It does this without object models, using only sensory information. This research demonstrates the use of a large modular architecture on a difficult task.
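To make the idea of a sensor-based potential field concrete, the following is a minimal sketch of the fixed, hand-coded baseline that such fields are usually contrasted with. It is not the architecture described above: the function name `sonar_potential_heading`, the gains `k_att` and `k_rep`, the evenly spaced sonar layout, and the inverse-range repulsion rule are all assumptions chosen for illustration. In the work summarized here the field is learned and adjusted to avoid local minima, whereas this sketch uses constant gains and can get trapped.

```python
import math


def sonar_potential_heading(ranges, goal_bearing, max_range=5.0,
                            k_att=1.0, k_rep=0.5):
    """Return a steering heading (radians, robot frame).

    ranges       -- sonar readings, assumed evenly spaced around the belt,
                    with index 0 pointing straight ahead (an assumption)
    goal_bearing -- direction to the home position in the robot frame
    k_att, k_rep -- hypothetical fixed gains; a learning system would
                    adapt the field instead of hard-coding these
    """
    n = len(ranges)

    # Attractive component: a unit pull toward the goal, scaled by k_att.
    fx = k_att * math.cos(goal_bearing)
    fy = k_att * math.sin(goal_bearing)

    for i, r in enumerate(ranges):
        if r >= max_range:
            continue  # nothing detected by this sensor
        angle = 2.0 * math.pi * i / n
        # Repulsion grows as the measured range shrinks and falls to
        # zero at max_range.
        mag = k_rep * (1.0 / max(r, 1e-3) - 1.0 / max_range)
        # Push away from the direction in which the obstacle was sensed.
        fx -= mag * math.cos(angle)
        fy -= mag * math.sin(angle)

    return math.atan2(fy, fx)


if __name__ == "__main__":
    clear = [5.0] * 8
    print(sonar_potential_heading(clear, 0.0))

    # Obstacle sensed 1.0 unit away at 45 degrees to the left of the heading.
    blocked = [5.0, 1.0, 5.0, 5.0, 5.0, 5.0, 5.0, 5.0]
    print(sonar_potential_heading(blocked, 0.0))
```

With no obstacles in range the heading equals the goal bearing; an obstacle sensed on the left deflects the heading to the right. Because the gains are constant, opposing attractive and repulsive terms can cancel, which is the local-minimum problem that the learned fields are meant to address.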