Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning
Posted by 爱吃猫的鱼0 on February 28, 2022, 18:19
[1] Nils J. Nilsson, et al. Principles of Artificial Intelligence, 1980, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[2] Richard S. Sutton, et al. Learning and Sequential Decision Making, 1989.
[3] Vijaykumar Gullapalli, et al. A stochastic reinforcement learning algorithm for learning real-valued functions, 1990, Neural Networks.
[4] Kumpati S. Narendra, et al. Learning automata - an introduction, 1989.
[5] P. Anandan, et al. Pattern-recognizing stochastic learning automata, 1985, IEEE Transactions on Systems, Man, and Cybernetics.
[6] Kumpati S. Narendra, et al. An N-player sequential stochastic game with identical payoffs, 1983, IEEE Transactions on Systems, Man, and Cybernetics.
[7] Graham C. Goodwin, et al. Adaptive filtering prediction and control, 1984.
[8] Geoffrey E. Hinton, et al. Learning and relearning in Boltzmann machines, 1986.
[9] P. Werbos, et al. Beyond Regression: New Tools for Prediction and Analysis in the Behavioral Sciences, 1974.
[10] Geoffrey E. Hinton, et al. Learning internal representations by error propagation, 1986.
[11] K. Narendra, et al. Decentralized learning in finite Markov chains, 1985, 1985 24th IEEE Conference on Decision and Control.
[12] Yann LeCun, et al. Une procédure d'apprentissage pour réseau à seuil asymétrique (A learning scheme for asymmetric threshold networks), 1985.
[13] Richard S. Sutton, et al. Temporal credit assignment in reinforcement learning, 1984.
[14] M. A. L. Thathachar, et al. A new approach to the design of reinforcement schemes for learning automata, 1985, IEEE Transactions on Systems, Man, and Cybernetics.
[15] Jing Peng, et al. Function Optimization using Connectionist Reinforcement Learning Algorithms, 1991.
[16] A. G. Barto, et al. Learning by statistical cooperation of self-interested neuron-like computing elements, 1985, Human Neurobiology.
[17] C. Watkins. Learning from delayed rewards, 1989.
[18] Michael I. Jordan, et al. Forward Models: Supervised Learning with a Distal Teacher, 1992, Cogn. Sci.
[19] R. J. Williams, et al. On the use of backpropagation in associative reinforcement learning, 1988, IEEE 1988 International Conference on Neural Networks.