Regionalized Policy Representation for Reinforcement Learning in POMDPs
Hui Li | Lawrence Carin | Ronald Parr | Xuejun Liao
[1] Leslie Pack Kaelbling, et al. Learning Policies for Partially Observable Environments: Scaling Up, 1997, ICML.
[2] Peter L. Bartlett, et al. Infinite-Horizon Policy-Gradient Estimation, 2001, J. Artif. Intell. Res.
[3] Leslie Pack Kaelbling, et al. Planning and Acting in Partially Observable Stochastic Domains, 1998, Artif. Intell.
[4] Pieter Bram Bakker, et al. The state of mind: reinforcement learning with recurrent neural networks, 2004.
[5] John Loch, et al. Using Eligibility Traces to Find the Best Memoryless Policy in Partially Observable Markov Decision Processes, 1998, ICML.
[6] Douglas Aberdeen, et al. Scalable Internal-State Policy-Gradient Methods for POMDPs, 2002, ICML.
[7] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.
[8] Kee-Eung Kim, et al. Learning Finite-State Controllers for Partially Observable Environments, 1999, UAI.
[9] Michael I. Jordan, et al. Reinforcement Learning Algorithm for Partially Observable Markov Decision Problems, 1994, NIPS.
[10] Marco Wiering, et al. Utile distinction hidden Markov models, 2004, ICML.