A reinforcement learning neural network for adaptive control of Markov chains
[1] Ronald A. Howard,et al. Dynamic Programming and Markov Processes , 1960 .
[2] N. Rouche,et al. Stability Theory by Liapunov's Direct Method , 1977 .
[3] P. Kumar,et al. Optimal adaptive controllers for unknown Markov chains , 1982 .
[4] Richard S. Sutton,et al. Neuronlike adaptive elements that can solve difficult learning control problems , 1983, IEEE Transactions on Systems, Man, and Cybernetics.
[5] Richard S. Sutton,et al. Temporal credit assignment in reinforcement learning , 1984 .
[6] Harold J. Kushner,et al. Approximation and Weak Convergence Methods for Random Processes , 1984 .
[7] P. Anandan,et al. Pattern-recognizing stochastic learning automata , 1985, IEEE Transactions on Systems, Man, and Cybernetics.
[8] Richard Wheeler,et al. Decentralized learning in finite Markov chains , 1985, 1985 24th IEEE Conference on Decision and Control.
[9] A G Barto,et al. Learning by statistical cooperation of self-interested neuron-like computing elements. , 1985, Human neurobiology.
[10] Patchigolla Kiran Kumar,et al. A Survey of Some Results in Stochastic Adaptive Control , 1985 .
[11] Mandayam A. L. Thathachar,et al. Learning Optimal Discriminant Functions through a Cooperative Game of Automata , 1987, IEEE Transactions on Systems, Man, and Cybernetics.
[12] R. Lippmann,et al. An introduction to computing with neural nets , 1987, IEEE ASSP Magazine.
[13] Mitsuo Sato,et al. Learning control of finite Markov chains with an explicit trade-off between estimation and control , 1988, IEEE Transactions on Systems, Man, and Cybernetics.
[14] Kumpati S. Narendra,et al. Learning automata - an introduction , 1989 .
[15] Pierre Priouret,et al. Adaptive Algorithms and Stochastic Approximations , 1990, Applications of Mathematics.
[16] Andrew G. Barto,et al. On the Computational Economics of Reinforcement Learning , 1991 .
[17] V. Borkar. Topics in controlled Markov chains , 1991 .
[18] Shouchuan Hu. Differential equations with discontinuous right-hand sides , 1991 .
[19] Ronald J. Williams,et al. Analysis of Some Incremental Variants of Policy Iteration: First Steps Toward Understanding Actor-Cr , 1993 .