Self-segmentation of sequences: automatic formation of hierarchies of sequential behaviors
[1] Jürgen Schmidhuber,et al. HQ-Learning , 1997, Adapt. Behav..
[2] Mark Humphrys,et al. W-learning: A simple RL-based Society of Mind , 1995 .
[3] Ian H. Witten,et al. Identifying Hierarchical Structure in Sequences: A linear-time algorithm , 1997, J. Artif. Intell. Res..
[4] Edward J. Sondik,et al. The Optimal Control of Partially Observable Markov Processes over the Infinite Horizon: Discounted Costs , 1978, Oper. Res..
[5] G. Monahan. State of the Art—A Survey of Partially Observable Markov Decision Processes: Theory, Models, and Algorithms , 1982 .
[6] Giovanni Soda,et al. Recurrent neural networks and prior knowledge for sequence processing: a constrained nondeterministic approach , 1995, Knowl. Based Syst..
[7] Andrew W. Moore,et al. The parti-game algorithm for variable resolution reinforcement learning in multidimensional state-spaces , 2004, Machine Learning.
[8] Masayuki Inaba,et al. Learning by watching: extracting reusable task knowledge from visual observation of human performance , 1994, IEEE Trans. Robotics Autom..
[9] Geoffrey E. Hinton,et al. Feudal Reinforcement Learning , 1992, NIPS.
[10] Jürgen Schmidhuber,et al. Learning Unambiguous Reduced Sequence Descriptions , 1991, NIPS.
[11] Long Ji Lin,et al. Reinforcement Learning of Non-Markov Decision Processes , 1995, Artif. Intell..
[12] Asim Roy,et al. A neural-network learning theory and a polynomial time RBF algorithm , 1997, IEEE Trans. Neural Networks.
[13] Jürgen Schmidhuber,et al. Learning Complex, Extended Sequences Using the Principle of History Compression , 1992, Neural Computation.
[14] Earl D. Sacerdoti. Planning in a hierarchy of abstraction spaces , 1973, IJCAI.
[15] Stuart J. Russell,et al. Approximating Optimal Policies for Partially Observable Stochastic Domains , 1995, IJCAI.
[16] Leslie Pack Kaelbling,et al. Hierarchical Learning in Stochastic Domains: Preliminary Results , 1993, ICML.
[17] Andrew McCallum,et al. Reinforcement learning with selective perception and hidden state , 1996 .
[18] Stuart J. Russell,et al. Reinforcement Learning with Hierarchies of Machines , 1997, NIPS.
[19] Doina Precup,et al. Multi-time Models for Temporally Abstract Planning , 1997, NIPS.
[20] Leslie Pack Kaelbling,et al. Acting Optimally in Partially Observable Stochastic Domains , 1994, AAAI.
[21] Jeffrey L. Elman,et al. Finding Structure in Time , 1990, Cogn. Sci..
[22] Ben J. A. Kröse,et al. Learning from delayed rewards , 1995, Robotics Auton. Syst..
[23] Satinder Singh,et al. Learning to Solve Markovian Decision Processes , 1993 .
[24] Mark B. Ring. Incremental Development of Complex Behaviors , 1991, ML.
[25] Long-Ji Lin,et al. Reinforcement learning for robots using neural networks , 1992 .
[26] Sebastian Thrun,et al. Finding Structure in Reinforcement Learning , 1994, NIPS.
[27] Justinian P. Rosca,et al. Evolution-Based Discovery of Hierarchical Behaviors , 1996, AAAI/IAAI, Vol. 1.
[28] Maja J. Matarić,et al. Learning to Use Selective Attention and Short-Term Memory in Sequential Tasks , 1996 .
[29] Qiang Yang,et al. Characterizing Abstraction Hierarchies for Planning , 1991, AAAI.
[30] Thomas G. Dietterich. Hierarchical Reinforcement Learning with the MAXQ Value Function Decomposition , 1999, J. Artif. Intell. Res..
[31] Lonnie Chrisman,et al. Reinforcement Learning with Perceptual Aliasing: The Perceptual Distinctions Approach , 1992, AAAI.
[32] Gerhard Weiß,et al. Distributed reinforcement learning , 1995, Robotics Auton. Syst..
[33] Andrew W. Moore,et al. Reinforcement Learning: A Survey , 1996, J. Artif. Intell. Res..
[34] Thomas G. Dietterich,et al. Hierarchical Explanation-Based Reinforcement Learning , 1997, ICML.
[35] C. Lee Giles,et al. Learning a class of large finite state machines with a recurrent neural network , 1995, Neural Networks.
[36] Ron Sun,et al. Learning Plans without a priori Knowledge , 2000, Adapt. Behav..
[37] Sridhar Mahadevan,et al. Automatic Programming of Behavior-Based Robots Using Reinforcement Learning , 1991, Artif. Intell..
[38] Chen K. Tham,et al. Reinforcement learning of multiple tasks using a hierarchical CMAC architecture , 1995, Robotics Auton. Syst..
[39] John N. Tsitsiklis,et al. Neuro-Dynamic Programming , 1996, Encyclopedia of Machine Learning.
[40] Michael I. Jordan,et al. Reinforcement Learning with Soft State Aggregation , 1994, NIPS.
[41] Richard S. Sutton,et al. TD Models: Modeling the World at a Mixture of Time Scales , 1995, ICML.
[42] Martin L. Puterman,et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming , 1994 .
[43] Barbara Hayes-Roth,et al. Plans should abstractly describe intended behavior , 1996 .
[44] Ron Sun,et al. Multi-agent reinforcement learning: weighting and partitioning , 1999, Neural Networks.
[45] Qiang Yang,et al. Downward Refinement and the Efficiency of Hierarchical Problem Solving , 1994, Artif. Intell..