Comparing Value-Function Estimation Algorithms in Undiscounted Problems
[1] Carlos S. Kubrusly, et al. Stochastic approximation algorithms and applications, 1973, CDC 1973.
[2] Steven D. Whitehead, et al. Complexity and Cooperation in Q-Learning, 1991, ML.
[3] Boris Polyak, et al. Acceleration of stochastic approximation by averaging, 1992.
[4] Sven Koenig, et al. Complexity Analysis of Real-Time Reinforcement Learning, 1992, AAAI.
[5] Reid G. Simmons, et al. Complexity Analysis of Real-Time Reinforcement Learning, 1993, AAAI.
[6] Mahesan Niranjan, et al. On-line Q-learning using connectionist systems, 1994.
[7] Csaba Szepesvári, et al. The Asymptotic Convergence-Rate of Q-learning, 1997, NIPS.
[8] Michael Kearns, et al. Finite-Sample Convergence Rates for Q-Learning and Indirect Algorithms, 1998, NIPS.
[9] Yishay Mansour, et al. A Sparse Sampling Algorithm for Near-Optimal Planning in Large Markov Decision Processes, 1999, Machine Learning.