Patching Approximate Solutions in Reinforcement Learning
[1] Thomas G. Dietterich. Hierarchical Reinforcement Learning with the MAXQ Value Function Decomposition, 1999, J. Artif. Intell. Res.
[2] Richard S. Sutton, et al. Introduction to Reinforcement Learning, 1998.
[3] Andrew W. Moore, et al. Prioritized Sweeping: Reinforcement Learning with Less Data and Less Time, 1993, Machine Learning.
[4] Richard E. Korf, et al. Real-Time Heuristic Search, 1990, Artif. Intell.
[5] Peter Stone, et al. Behavior transfer for value-function-based reinforcement learning, 2005, AAMAS '05.
[6] Nikos A. Vlassis, et al. Sparse cooperative Q-learning, 2004, ICML.
[7] Craig Boutilier, et al. Decision-Theoretic Planning: Structural Assumptions and Computational Leverage, 1999, J. Artif. Intell. Res.
[8] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.
[9] Ben J. A. Kröse, et al. Learning from delayed rewards, 1995, Robotics Auton. Syst.
[10] Nikos A. Vlassis, et al. Utile Coordination: Learning Interdependencies Among Cooperative Agents, 2005, CIG.
[11] Andrew G. Barto, et al. Learning to Act Using Real-Time Dynamic Programming, 1995, Artif. Intell.