About Q-values of Monte Carlo method
[1] Dana H. Ballard, et al. Active Perception and Reinforcement Learning, 1990, Neural Computation.
[2] Uemura Wataru, et al. SAPS: The Exploitation Reinforcement Learning Method on POMDPs, 2004.
[3] J. Grefenstette. Credit Assignment in Rule Discovery Systems Based on Genetic Algorithms, 2005, Machine Learning.
[4] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.
[5] Peter Dayan, et al. Q-learning, 1992, Machine Learning.
[6] Richard S. Sutton, et al. Integrated Architectures for Learning, Planning, and Reacting Based on Approximating Dynamic Programming, 1990, ML.
[7] John J. Grefenstette, et al. Credit assignment in rule discovery systems based on genetic algorithms, 1988, Machine Learning.
[8] W. Uemura. About distributing rewards to a rule with probabilistic state transition, 2007, SICE Annual Conference 2007.
[9] Peter Dayan, et al. Technical Note: Q-Learning, 2004, Machine Learning.