Reinforcement learning is direct adaptive optimal control
R.S. Sutton | A.G. Barto | R.J. Williams
[1] P. Mandl, et al. Estimation and control in Markov chains, 1974, Advances in Applied Probability.
[2] Ian H. Witten, et al. An Adaptive Optimal Controller for Discrete-Time Markov Environments, 1977, Inf. Control.
[3] V. Borkar, et al. Adaptive control of Markov chains, I: Finite parameter set, 1979.
[4] P. Kumar, et al. Optimal adaptive controllers for unknown Markov chains, 1982.
[5] Richard S. Sutton, et al. Neuronlike adaptive elements that can solve difficult learning control problems, 1983, IEEE Transactions on Systems, Man, and Cybernetics.
[6] P. Anandan, et al. Pattern-recognizing stochastic learning automata, 1985, IEEE Transactions on Systems, Man, and Cybernetics.
[7] Richard Wheeler, et al. Decentralized learning in finite Markov chains, 1985, 1985 24th IEEE Conference on Decision and Control.
[8] Richard S. Sutton, et al. Training and Tracking in Robotics, 1985, IJCAI.
[9] Paul J. Werbos, et al. Building and Understanding Adaptive Systems: A Statistical/Numerical Approach to Factory Automation and Brain Research, 1987, IEEE Transactions on Systems, Man, and Cybernetics.
[10] Paul J. Werbos, et al. Neural networks for control and system identification, 1989, Proceedings of the 28th IEEE Conference on Decision and Control.
[11] A. Jalali, et al. Computationally efficient adaptive control algorithms for Markov chains, 1989, Proceedings of the 28th IEEE Conference on Decision and Control.
[12] Richard E. Korf, et al. Real-Time Heuristic Search, 1990, Artif. Intell.
[13] Andrew G. Barto, et al. On the Computational Economics of Reinforcement Learning, 1991.
[14] Richard S. Sutton, et al. Dyna, an integrated architecture for learning, planning, and reacting, 1990, SGAR.
[15] Long-Ji Lin, et al. Self-improving reactive agents: case studies of reinforcement learning frameworks, 1991.