Reinforcement Learning in Non-Markov Environments
[1] W. J. Langford. Statistical Methods , 1959, Nature.
[2] Arthur L. Samuel,et al. Some Studies in Machine Learning Using the Game of Checkers , 1967, IBM J. Res. Dev..
[3] A. L. Yarbus,et al. Eye Movements and Vision , 1967, Springer US.
[4] Richard S. Sutton,et al. Temporal credit assignment in reinforcement learning , 1984 .
[5] S. Ullman. Visual routines , 1984, Cognition.
[6] Geoffrey E. Hinton,et al. Learning internal representations by error propagation , 1986 .
[7] John H. Holland,et al. Escaping brittleness: the possibilities of general-purpose learning algorithms applied to parallel rule-based systems , 1995 .
[8] Dimitri P. Bertsekas,et al. Dynamic Programming: Deterministic and Stochastic Models , 1987 .
[9] John H. Holland,et al. Empirical studies of default hierarchies and sequences of rules in learning classifier systems , 1988 .
[10] Philip E. Agre,et al. The dynamic structure of everyday life , 1988 .
[11] David W. Aha,et al. Incremental, Instance-Based Learning of Independent and Graded Concept Descriptions , 1989, ML.
[12] Lashon B. Booker,et al. Triggered Rule Discovery in Classifier Systems , 1989, ICGA.
[13] Alexander H. Waibel,et al. Modular Construction of Time-Delay Neural Networks for Speech Recognition , 1989, Neural Computation.
[14] Jürgen Schmidhuber,et al. Reinforcement Learning in Markovian and Non-Markovian Environments , 1990, NIPS.
[15] Paul E. Utgoff,et al. Explaining Temporal Differences to Create Useful Concepts for Evaluating States , 1990, AAAI.
[16] Sebastian Thrun,et al. Planning with an Adaptive World Model , 1990, NIPS.
[17] Jeffrey L. Elman,et al. Finding Structure in Time , 1990, Cogn. Sci..
[18] Jürgen Schmidhuber. Making the World Differentiable: On Using Self-Supervised Fully Recurrent Neural Networks for Dynamic Reinforcement Learning , 1990 .
[19] Dana H. Ballard,et al. Active Perception and Reinforcement Learning , 1990, Neural Computation.
[20] Richard S. Sutton,et al. Integrated Architectures for Learning, Planning, and Reacting Based on Approximating Dynamic Programming , 1990, ML.
[21] Dana H. Ballard,et al. Animate Vision , 1991, Artif. Intell..
[22] Long Ji Lin,et al. Programming Robots Using Reinforcement Learning and Teaching , 1991, AAAI.
[23] Ming Tan,et al. Cost-Sensitive Reinforcement Learning for Adaptive Classification and Control , 1991, AAAI.
[24] Satinder P. Singh,et al. Transfer of Learning Across Compositions of Sequential Tasks , 1991, ML.
[25] Long-Ji Lin,et al. Self-improving reactive agents: case studies of reinforcement learning frameworks , 1991 .
[26] Ming Tan,et al. Cost-sensitive robot learning , 1991 .
[27] S. Thrun. Efficient Exploration in Reinforcement Learning , 1992 .
[28] Sridhar Mahadevan,et al. Automatic Programming of Behavior-Based Robots Using Reinforcement Learning , 1991, Artif. Intell..
[29] Steven Douglas Whitehead,et al. Reinforcement learning for the adaptive control of perception and action , 1992 .
[30] Paul E. Utgoff,et al. A Teaching Method for Reinforcement Learning , 1992, ML.
[31] Long Lin,et al. Memory Approaches to Reinforcement Learning in Non-Markovian Domains , 1992 .
[32] Leslie Pack Kaelbling,et al. Learning in embedded systems , 1993 .
[33] Jonas Karlsson,et al. Learning Multiple Goal Behavior via Task Decomposition and Dynamic Policy Merging , 1993 .