Instance-Based Utile Distinctions for Reinforcement Learning with Hidden State

[1]  Thomas G. Dietterich.  What is machine learning?, 2020, Archives of Disease in Childhood.

[2]  Astro Teller, et al.  The evolution of mental models, 1994.

[3]  Leslie Pack Kaelbling, et al.  Acting Optimally in Partially Observable Stochastic Domains, 1994, AAAI.

[4]  Dana Ron, et al.  Learning probabilistic automata with variable memory length, 1994, COLT '94.

[5]  Michael L. Littman, et al.  Memoryless policies: theoretical limitations and practical results, 1994.

[6]  Chelsea C. White, et al.  Finite-Memory Suboptimal Design for Partially Observed Markov Decision Processes, 1994, Oper. Res.

[7]  Andrew McCallum, et al.  Instance-Based State Identification for Reinforcement Learning, 1994, NIPS.

[8]  Michael I. Jordan, et al.  Reinforcement Learning Algorithm for Partially Observable Markov Decision Problems, 1994, NIPS.

[9]  Andrew McCallum, et al.  Overcoming Incomplete Perception with Utile Distinction Memory, 1993, ICML.

[10]  J. Peng, et al.  Efficient Learning and Planning Within the Dyna Framework, 1993, IEEE International Conference on Neural Networks.

[11]  Andrew W. Moore, et al.  Memory-Based Reinforcement Learning: Efficient Computation with Prioritized Sweeping, 1992, NIPS.

[12]  Lonnie Chrisman, et al.  Reinforcement Learning with Perceptual Aliasing: The Perceptual Distinctions Approach, 1992, AAAI.

[13]  Sebastian Thrun, et al.  Efficient Exploration In Reinforcement Learning, 1992.

[14]  Long-Ji Lin, et al.  Reinforcement learning for robots using neural networks, 1992.

[15]  Long Ji Lin, et al.  Programming Robots Using Reinforcement Learning and Teaching, 1991, AAAI.

[16]  Richard S. Sutton, et al.  Integrated Architectures for Learning, Planning, and Reacting Based on Approximating Dynamic Programming, 1990, ML.

[17]  C. Watkins.  Learning from delayed rewards, 1989.

[18]  R. Lathe.  Phd by thesis, 1988, Nature.

[19]  M. K rn, et al.  Stochastic Optimal Control, 1988.

[20]  John C. Reynolds, et al.  School of Computer Science, 1992.

[21]  S. Ullman.  Visual routines, 1984, Cognition.

[22]  Loren K. Platzman, et al.  Finite memory estimation and control of finite probabilistic systems, 1977.

[23]  L. Goddard, et al.  Operations Research (OR), 2007.

[24]  J. Smith.  Seattle, 1906.