Open Theoretical Questions in Reinforcement Learning
[1] David Elkind,et al. Learning: An Introduction , 1968 .
[2] Richard S. Sutton,et al. Temporal credit assignment in reinforcement learning , 1984 .
[3] C. Watkins. Learning from delayed rewards , 1989 .
[4] C. Atkeson,et al. Prioritized Sweeping: Reinforcement Learning with Less Data and Less Real , 1993 .
[5] Satinder Singh,et al. Learning to Solve Markovian Decision Processes , 1993 .
[6] Gerald Tesauro,et al. Temporal difference learning and TD-Gammon , 1995, CACM.
[7] Andrew G. Barto,et al. Improving Elevator Performance Using Reinforcement Learning , 1995, NIPS.
[8] Richard S. Sutton,et al. Generalization in Reinforcement Learning: Successful Examples Using Sparse Coarse , 1996 .
[9] Leemon C. Baird,et al. Residual Algorithms: Reinforcement Learning with Function Approximation , 1995, ICML.
[10] Richard S. Sutton,et al. Reinforcement Learning with Replacing Eligibility Traces , 2005, Machine Learning.
[11] John N. Tsitsiklis,et al. Neuro-Dynamic Programming , 1996, Encyclopedia of Machine Learning.
[12] John N. Tsitsiklis,et al. Analysis of temporal-difference learning with function approximation , 1996, NIPS 1996.
[13] Dimitri P. Bertsekas,et al. Reinforcement Learning for Dynamic Channel Allocation in Cellular Telephone Systems , 1996, NIPS.
[14] John Loch,et al. Using Eligibility Traces to Find the Best Memoryless Policy in Partially Observable Markov Decision Processes , 1998, ICML.
[15] Richard S. Sutton,et al. Introduction to Reinforcement Learning , 1998 .
[16] Sridhar Mahadevan,et al. Average Reward Reinforcement Learning: Foundations, Algorithms, and Empirical Results , 2005, Machine Learning.