An optimization-based categorization of reinforcement learning environments
[1] Richard S. Sutton, et al. Temporal credit assignment in reinforcement learning, 1984.
[2] E. Kalai, et al. Finite Rationality and Interpersonal Complexity in Repeated Games, 1988.
[3] David H. Ackley, et al. Generalization and Scaling in Reinforcement Learning, 1989, NIPS.
[4] Robert B. Allen, et al. Adaptive training for connectionist state machines, 1989, CSC '89.
[5] Richard S. Sutton, et al. Integrated Architectures for Learning, Planning, and Reacting Based on Approximating Dynamic Programming, 1990, ML.
[6] Stewart W. Wilson. The animat path to AI, 1991.
[7] David H. Ackley, et al. Interactions between learning and evolution, 1991.
[8] Richard S. Sutton, et al. Planning by Incremental Dynamic Programming, 1991, ML.
[9] Zoubin Ghahramani, et al. Temporal processing with connectionist networks, 1991, IJCNN-91-Seattle International Joint Conference on Neural Networks.
[10] David H. Ackley, et al. Adaptation in Constant Utility Non-Stationary Environments, 1991, ICGA.
[11] S. Thrun. Efficient Exploration in Reinforcement Learning, 1992.
[12] Alan F. Murray, et al. International Joint Conference on Neural Networks, 1993.
[13] Ben J. A. Kröse, et al. Learning from delayed rewards, 1995, Robotics Auton. Syst.