Approximating Optimal Policies for Partially Observable Stochastic Domains
[1] R. Bellman. Dynamic programming, 1957, Science.
[2] E. J. Sondik, et al. The Optimal Control of Partially Observable Markov Decision Processes, 1971.
[4] W. Lovejoy. A survey of algorithmic methods for partially observed Markov decision processes, 1991.
[5] Lonnie Chrisman, et al. Reinforcement Learning with Perceptual Aliasing: The Perceptual Distinctions Approach, 1992, AAAI.
[6] Long Lin, et al. Memory Approaches to Reinforcement Learning in Non-Markovian Domains, 1992.
[7] Thomas Martinetz, et al. 'Neural-gas' network for vector quantization and its application to time-series prediction, 1993, IEEE Trans. Neural Networks.
[8] Andrew McCallum, et al. Overcoming Incomplete Perception with Utile Distinction Memory, 1993, ICML.
[9] Leslie Pack Kaelbling, et al. Acting Optimally in Partially Observable Stochastic Domains, 1994, AAAI.
[10] Michael I. Jordan, et al. Reinforcement Learning Algorithm for Partially Observable Markov Decision Problems, 1994, NIPS.
[11] Stuart J. Russell, et al. Adaptive Probabilistic Networks, 1994.
[12] M. Littman. The Witness Algorithm: Solving Partially Observable Markov Decision Processes, 1994.
[13] Peter Norvig, et al. Artificial Intelligence: A Modern Approach, 1995.
[14] Leslie Pack Kaelbling, et al. Learning Policies for Partially Observable Environments: Scaling Up, 1997, ICML.