Learning Policies with External Memory
[1] C. Watkins. Learning from delayed rewards, 1989.
[2] D.E. Goldberg, et al. Classifier Systems and Genetic Algorithms, 1989, Artif. Intell.
[3] Lawrence R. Rabiner, et al. A tutorial on hidden Markov models and selected applications in speech recognition, 1989, Proc. IEEE.
[4] Michael L. Littman, et al. Markov Games as a Framework for Multi-Agent Reinforcement Learning, 1994, ICML.
[5] Martin L. Puterman, et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming, 1994.
[6] Michael I. Jordan, et al. Learning Without State-Estimation in Partially Observable Markovian Decision Processes, 1994, ICML.
[7] Mark B. Ring. Continual learning in reinforcement environments, 1995, GMD-Bericht.
[8] Andrew McCallum, et al. Reinforcement learning with selective perception and hidden state, 1996.
[9] Milos Hauskrecht, et al. Planning and control in stochastic domains with imperfect information, 1997.
[10] Leslie Pack Kaelbling, et al. Planning and Acting in Partially Observable Stochastic Domains, 1998, Artif. Intell.
[11] John Loch, et al. Using Eligibility Traces to Find the Best Memoryless Policy in Partially Observable Markov Decision Processes, 1998, ICML.
[12] Alex M. Andrew, et al. Reinforcement Learning: An Introduction, 1998.
[13] Eric A. Hansen, et al. Solving POMDPs by Searching in Policy Space, 1998, UAI.
[14] Andrew W. Moore, et al. Gradient Descent for General Reinforcement Learning, 1998, NIPS.
[15] Mario Martin. Reinforcement Learning for Embedded Agents facing Complex Tasks, 1998.
[16] Shlomo Zilberstein, et al. Finite-memory control of partially observable systems, 1998.
[17] Kee-Eung Kim, et al. Solving POMDPs by Searching the Space of Finite Policies, 1999, UAI.
[18] Kee-Eung Kim, et al. Learning Finite-State Controllers for Partially Observable Environments, 1999, UAI.
[19] Jean-Louis Deneubourg, et al. From local actions to global tasks: stigmergy and collective robotics, 2000.