Reinforcement Learning: On Being Wise During the Event
A backchaining algorithm, an example of unsupervised sequential decision learning in an unknown environment, is defined. The algorithm is described as a neural network implementation, and its behaviour is demonstrated through a simulation of a goal-finding system in a two-dimensional world. The relationship between this model and some important concepts in animal learning theory is briefly discussed.
[1] Paul J. Werbos, et al. Consistency of HDP applied to a simple reinforcement learning problem, 1990, Neural Networks.
[2] Richard S. Sutton, et al. Neuronlike adaptive elements that can solve difficult learning control problems, 1983, IEEE Transactions on Systems, Man, and Cybernetics.
[3] C. L. Hull. The goal-gradient hypothesis and maze learning, 1932.
[4] C. L. Hull. The concept of the habit-family hierarchy, and maze learning. Part I, 1934.