Adaptive Combination of Behaviors in an Agent
[1] Michael I. Jordan, et al. Learning Without State-Estimation in Partially Observable Markovian Decision Processes, 1994, ICML.
[2] Rodney A. Brooks, et al. A Robust Layered Control System for a Mobile Robot, 1986.
[3] K. R. Dixon, et al. Incorporating Prior Knowledge and Previously Learned Information into Reinforcement Learning Agents, 2000.
[4] Peter Norvig, et al. Artificial Intelligence: A Modern Approach, 1995.
[5] Brian Sallans, et al. Learning Factored Representations for Partially Observable Markov Decision Processes, 1999, NIPS.
[6] Mark Humphrys, et al. Action Selection Methods Using Reinforcement Learning, 1997.
[7] Kee-Eung Kim, et al. Learning to Cooperate via Policy Search, 2000, UAI.
[8] Olivier Buffet, et al. Multi-Agent Systems by Incremental Gradient Reinforcement Learning, 2001, IJCAI.
[9] Peter L. Bartlett, et al. Infinite-Horizon Policy-Gradient Estimation, 2001, J. Artif. Intell. Res.
[10] Andrew G. Barto, et al. Reinforcement Learning, 1998.
[11] Richard S. Sutton, et al. Introduction to Reinforcement Learning, 1998.
[12] Gang Wang, et al. Hierarchical Optimization of Policy-Coupled Semi-Markov Decision Processes, 1999, ICML.
[13] Ronald J. Williams, et al. Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning, 1992, Machine Learning.
[14] Manuela M. Veloso, et al. Multiagent Systems: A Survey from a Machine Learning Perspective, 2000, Auton. Robots.
[15] Doina Precup, et al. Multi-time Models for Temporally Abstract Planning, 1997, NIPS.