Reinforcement Learning through Global Stochastic Search in N-MDPs
[1] James C. Spall, et al. Introduction to stochastic search and optimization - estimation, simulation, and control, 2003, Wiley-Interscience series in discrete mathematics and optimization.
[2] John Loch, et al. Using Eligibility Traces to Find the Best Memoryless Policy in Partially Observable Markov Decision Processes, 1998, ICML.
[3] Daniel Polani, et al. Learning RoboCup-Keepaway with Kernels, 2007, Gaussian Processes in Practice.
[4] Theodore J. Perkins, et al. Reinforcement learning for POMDPs based on action values and stochastic optimization, 2002, AAAI/IAAI.
[5] Peter Stone, et al. An empirical analysis of value function-based and policy search reinforcement learning, 2009, AAMAS.
[6] Michael L. Littman, et al. Memoryless policies: theoretical limitations and practical results, 1994.
[7] H. Peyton Young, et al. Strategic Learning and Its Limits, 2004.
[8] Peter Stone, et al. Reinforcement Learning for RoboCup Soccer Keepaway, 2005, Adapt. Behav.
[9] Mark D. Pendrith, et al. An Analysis of Direct Reinforcement Learning in Non-Markovian Domains, 1998, ICML.
[10] Peter L. Bartlett, et al. Infinite-Horizon Policy-Gradient Estimation, 2001, J. Artif. Intell. Res.
[11] Paul A. Crook. Learning in a state of confusion: employing active perception and reinforcement learning in partially observable worlds, 2007.
[12] Peter Stone, et al. Learning Complementary Multiagent Behaviors: A Case Study, 2009, RoboCup.
[13] Peter Auer, et al. Near-optimal Regret Bounds for Reinforcement Learning, 2008, J. Mach. Learn. Res.
[14] Risto Miikkulainen, et al. Evolving Soccer Keepaway Players Through Task Decomposition, 2005, Machine Learning.
[15] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.
[16] Richard S. Sutton, et al. Reinforcement learning with replacing eligibility traces, 2004, Machine Learning.
[17] Luca Iocchi, et al. Improving the performance of complex agent plans through reinforcement learning, 2010, AAMAS.
[18] Shimon Whiteson, et al. Transfer via inter-task mappings in policy search reinforcement learning, 2007, AAMAS '07.
[19] Theodore J. Perkins, et al. On the Existence of Fixed Points for Q-Learning and Sarsa in Partially Observable Domains, 2002, ICML.
[20] Richard S. Sutton, et al. Introduction to Reinforcement Learning, 1998.