Improving the performance of complex agent plans through reinforcement learning
[1] John Loch, et al. Using Eligibility Traces to Find the Best Memoryless Policy in Partially Observable Markov Decision Processes, 1998, ICML.
[2] Blai Bonet, et al. Planning and Control in Artificial Intelligence: A Unifying Perspective, 2001, Applied Intelligence.
[3] Doina Precup, et al. Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning, 1999, Artif. Intell.
[4] Peter Stone, et al. The utility of temporal abstraction in reinforcement learning, 2008, AAMAS.
[5] David Andre, et al. Programmable Reinforcement Learning Agents, 2000, NIPS.
[6] Theodore J. Perkins, et al. On the Existence of Fixed Points for Q-Learning and Sarsa in Partially Observable Domains, 2002, ICML.
[7] Craig Boutilier, et al. Decision-Theoretic, High-Level Agent Programming in the Situation Calculus, 2000, AAAI/IAAI.
[8] Ola Pettersson, et al. Execution monitoring in robotics: A survey, 2005, Robotics Auton. Syst.
[9] Peter Stone, et al. Learning Complementary Multiagent Behaviors: A Case Study, 2009, RoboCup.
[10] Sheila A. McIlraith, et al. Decision-Theoretic GOLOG with Qualitative Preferences, 2006, KR.
[11] Theodore J. Perkins, et al. Reinforcement learning for POMDPs based on action values and stochastic optimization, 2002, AAAI/IAAI.
[12] A. Cassandra, et al. Exact and approximate algorithms for partially observable Markov decision processes, 1998.
[13] Michael I. Jordan, et al. Learning Without State-Estimation in Partially Observable Markovian Decision Processes, 1994, ICML.
[14] Peter Stone, et al. Reinforcement Learning for RoboCup Soccer Keepaway, 2005, Adapt. Behav.
[15] Mark D. Pendrith, et al. An Analysis of Direct Reinforcement Learning in Non-Markovian Domains, 1998, ICML.
[16] David Harel, et al. Statecharts: A Visual Formalism for Complex Systems, 1987, Sci. Comput. Program.
[17] Bhaskara Marthi, et al. Concurrent Hierarchical Reinforcement Learning, 2005, IJCAI.