Experimental Results on Learning Stochastic Memoryless Policies for Partially Observable Markov Decision Processes
[1] W. Lovejoy. A survey of algorithmic methods for partially observed Markov decision processes, 1991.
[2] Lonnie Chrisman, et al. Reinforcement Learning with Perceptual Aliasing: The Perceptual Distinctions Approach, 1992, AAAI.
[3] Michael L. Littman, et al. Memoryless policies: theoretical limitations and practical results, 1994.
[4] Michael I. Jordan, et al. Reinforcement Learning Algorithm for Partially Observable Markov Decision Problems, 1994, NIPS.
[5] Michael I. Jordan, et al. Learning Without State-Estimation in Partially Observable Markovian Decision Processes, 1994, ICML.
[6] Stuart J. Russell, et al. Approximating Optimal Policies for Partially Observable Stochastic Domains, 1995, IJCAI.
[7] Leslie Pack Kaelbling, et al. Learning Policies for Partially Observable Environments: Scaling Up, 1997, ICML.
[8] John Loch, et al. Using Eligibility Traces to Find the Best Memoryless Policy in Partially Observable Markov Decision Processes, 1998, ICML.
[9] Richard S. Sutton, et al. Introduction to Reinforcement Learning, 1998.