Instance-Based Utile Distinctions for Reinforcement Learning with Hidden State
[1] Thomas G. Dietterich. What is machine learning?, 2020, Archives of Disease in Childhood.
[2] Astro Teller, et al. The evolution of mental models, 1994.
[3] Leslie Pack Kaelbling, et al. Acting Optimally in Partially Observable Stochastic Domains, 1994, AAAI.
[4] Dana Ron, et al. Learning probabilistic automata with variable memory length, 1994, COLT '94.
[5] Michael L. Littman, et al. Memoryless policies: theoretical limitations and practical results, 1994.
[6] Chelsea C. White, et al. Finite-Memory Suboptimal Design for Partially Observed Markov Decision Processes, 1994, Oper. Res.
[7] Andrew McCallum, et al. Instance-Based State Identification for Reinforcement Learning, 1994, NIPS.
[8] Michael I. Jordan, et al. Reinforcement Learning Algorithm for Partially Observable Markov Decision Problems, 1994, NIPS.
[9] Andrew McCallum, et al. Overcoming Incomplete Perception with Utile Distinction Memory, 1993, ICML.
[10] J. Peng, et al. Efficient Learning and Planning Within the Dyna Framework, 1993, IEEE International Conference on Neural Networks.
[11] Andrew W. Moore, et al. Memory-Based Reinforcement Learning: Efficient Computation with Prioritized Sweeping, 1992, NIPS.
[12] Lonnie Chrisman, et al. Reinforcement Learning with Perceptual Aliasing: The Perceptual Distinctions Approach, 1992, AAAI.
[13] Sebastian Thrun, et al. Efficient Exploration In Reinforcement Learning, 1992.
[14] Long-Ji Lin, et al. Reinforcement learning for robots using neural networks, 1992.
[15] Long-Ji Lin, et al. Programming Robots Using Reinforcement Learning and Teaching, 1991, AAAI.
[16] Richard S. Sutton, et al. Integrated Architectures for Learning, Planning, and Reacting Based on Approximating Dynamic Programming, 1990, ML.
[17] C. Watkins. Learning from delayed rewards, 1989.
[18] R. Lathe. Phd by thesis, 1988, Nature.
[19] M. K rn, et al. Stochastic Optimal Control, 1988.
[20] John C. Reynolds, et al. School of Computer Science, 1992.
[21] S. Ullman. Visual routines, 1984, Cognition.
[22] Loren K. Platzman, et al. Finite memory estimation and control of finite probabilistic systems, 1977.
[23] L. Goddard, et al. Operations Research (OR), 2007.