Paper information - Instance-Based Utile Distinctions for Reinforcement Learning with Hidden State - 字舞流文

Instance-Based Utile Distinctions for Reinforcement Learning with Hidden State

Andrew McCallum

[1] Thomas G. Dietterich. What is machine learning?, 2020, Archives of Disease in Childhood.

[2] Astro Teller, et al. The evolution of mental models, 1994.

[3] Leslie Pack Kaelbling, et al. Acting Optimally in Partially Observable Stochastic Domains, 1994, AAAI.

[4] Dana Ron, et al. Learning probabilistic automata with variable memory length, 1994, COLT '94.

[5] Michael L. Littman, et al. Memoryless policies: theoretical limitations and practical results, 1994.

[6] Chelsea C. White, et al. Finite-Memory Suboptimal Design for Partially Observed Markov Decision Processes, 1994, Oper. Res.

[7] Andrew McCallum, et al. Instance-Based State Identification for Reinforcement Learning, 1994, NIPS.

[8] Michael I. Jordan, et al. Reinforcement Learning Algorithm for Partially Observable Markov Decision Problems, 1994, NIPS.

[9] Andrew McCallum, et al. Overcoming Incomplete Perception with Utile Distinction Memory, 1993, ICML.

[10] J. Peng, et al. Efficient Learning and Planning Within the Dyna Framework, 1993, IEEE International Conference on Neural Networks.

[11] Andrew W. Moore, et al. Memory-Based Reinforcement Learning: Efficient Computation with Prioritized Sweeping, 1992, NIPS.

[12] Lonnie Chrisman, et al. Reinforcement Learning with Perceptual Aliasing: The Perceptual Distinctions Approach, 1992, AAAI.

[13] Sebastian Thrun, et al. Efficient Exploration In Reinforcement Learning, 1992.

[14] Long-Ji Lin, et al. Reinforcement learning for robots using neural networks, 1992.

[15] Long Ji Lin, et al. Programming Robots Using Reinforcement Learning and Teaching, 1991, AAAI.

[16] Richard S. Sutton, et al. Integrated Architectures for Learning, Planning, and Reacting Based on Approximating Dynamic Programming, 1990, ML.

[17] C. Watkins. Learning from delayed rewards, 1989.

[18] R. Lathe. PhD by thesis, 1988, Nature.

[19] M. K rn, et al. Stochastic Optimal Control, 1988.

[20] John C. Reynolds, et al. School of Computer Science, 1992.

[21] S. Ullman. Visual routines, 1984, Cognition.

[22] Loren K. Platzman, et al. Finite memory estimation and control of finite probabilistic systems, 1977.

[23] L. Goddard, et al. Operations Research (OR), 2007.

[24] J. Smith. Seattle, 1906.