Paper information - Conditions for the convergence of one-layer networks under reinforcement learning

An extension to the convergence theorem for single neurons learning under the A_R-P (associative reward-penalty) algorithm is proved. The extension shows that if the conditions of the single-neuron theorem are satisfied and the environment satisfies one of two sufficient conditions, the weights in an arbitrarily large one-layer network converge with probability one to values with which the network correctly classifies the training input set. The first condition requires that, for all output vectors, the probability of the reinforcement being one (success) takes one of two values: output vectors with at least lk correct elements have the higher probability, whereas output vectors with fewer than lk correct elements have the lower probability. The alternative condition requires that the reinforcement have a higher probability of being one for output vectors with a higher number of correct elements. The extension and its proof are significant because they further the understanding of the factors affecting the convergence of multilayer networks under reinforcement learning.
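
The two environment conditions in the abstract can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration and is not taken from the paper: the function names, the probability levels 0.9 and 0.1, the `threshold` parameter (standing in for lk), and the linear form chosen for the second condition, which is just one way of making the success probability grow with the number of correct elements.

```python
import random


def count_correct(output, target):
    """Count the positions where the output vector matches the target."""
    return sum(o == t for o, t in zip(output, target))


def threshold_success_prob(n_correct, threshold, p_high=0.9, p_low=0.1):
    """First condition (illustrative): only two probability levels exist.

    Outputs with at least `threshold` correct elements get p_high,
    all other outputs get p_low. The values 0.9 and 0.1 are assumed.
    """
    return p_high if n_correct >= threshold else p_low


def graded_success_prob(n_correct, n_units, p_min=0.1, p_max=0.9):
    """Second condition (illustrative): more correct elements, higher probability.

    A linear ramp is an assumption; any increasing function would fit
    the description in the abstract.
    """
    return p_min + (p_max - p_min) * n_correct / n_units


def reinforcement(prob, rng=random):
    """Sample the binary reinforcement: 1 (success) with probability `prob`."""
    return 1 if rng.random() < prob else 0
```

For example, `reinforcement(threshold_success_prob(count_correct(y, t), threshold))` draws one success/failure signal for an output vector `y` against a target `t` under the first condition.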

Mabo Robert Ito | P. D. Lawrence | J. C. C. Ip
