Continual learning in reinforcement environments
[1] I. Miller. Probability, Random Variables, and Stochastic Processes , 1966 .
[2] J. Albus. Mechanisms of planning and problem solving in the brain , 1979 .
[3] C. Roads,et al. The Handbook of Artificial Intelligence, Volume 1 , 1982 .
[4] Stephen Grossberg,et al. A Theory of Human Memory: Self-Organization and Performance of Sensory-Motor Codes, Maps, and Plans , 1982 .
[5] John S. Edwards,et al. The Hedonistic Neuron: A Theory of Memory, Learning and Intelligence , 1983 .
[6] Richard S. Sutton,et al. Temporal credit assignment in reinforcement learning , 1984 .
[7] Geoffrey E. Hinton,et al. Learning internal representations by error propagation , 1986 .
[8] Charles W. Anderson,et al. Learning and problem-solving with multilayer connectionist systems (adaptive, strategy learning, neural networks, reinforcement learning) , 1986 .
[9] Rodney A. Brooks,et al. A Robust Layered Control System For A Mobile Robot , 1986 .
[10] Robert E. Schapire,et al. A new approach to unsupervised learning in deterministic environments , 1990 .
[11] Colin Giles,et al. Learning, invariance, and generalization in high-order neural networks. , 1987, Applied optics.
[12] Stewart W. Wilson. Hierarchical Credit Allocation in a Classifier System , 1987, IJCAI.
[13] Terrence J. Sejnowski,et al. Parallel Networks that Learn to Pronounce English Text , 1987, Complex Syst..
[14] PAUL J. WERBOS,et al. Generalization of backpropagation with application to a recurrent gas market model , 1988, Neural Networks.
[15] Benjamin Kuipers,et al. A Robust, Qualitative Method for Robot Spatial Learning , 1988, AAAI.
[16] John Moody,et al. Fast Learning in Networks of Locally-Tuned Processing Units , 1989, Neural Computation.
[17] C. Lee Giles,et al. Higher Order Recurrent Networks and Grammatical Inference , 1989, NIPS.
[18] B. Widrow,et al. The truck backer-upper: an example of self-learning in neural networks , 1989, International 1989 Joint Conference on Neural Networks.
[19] Ronald J. Williams,et al. Experimental Analysis of the Real-time Recurrent Learning Algorithm , 1989 .
[20] John N. Tsitsiklis,et al. Parallel and distributed computation , 1989 .
[21] Michael C. Mozer,et al. A Focused Backpropagation Algorithm for Temporal Pattern Recognition , 1989, Complex Syst..
[22] A. Barto,et al. Learning and Sequential Decision Making , 1989 .
[23] Lawrence R. Rabiner,et al. A tutorial on hidden Markov models and selected applications in speech recognition , 1989, Proc. IEEE.
[24] Christian Lebiere,et al. The Cascade-Correlation Learning Architecture , 1989, NIPS.
[25] Michael I. Jordan,et al. Learning to Control an Unstable System with Forward Modeling , 1989, NIPS.
[26] James L. McClelland,et al. Finite State Automata and Simple Recurrent Networks , 1989, Neural Computation.
[27] Ronald J. Williams,et al. A Learning Algorithm for Continually Running Fully Recurrent Neural Networks , 1989, Neural Computation.
[28] S. Fahlman. Fast-learning variations on back propagation: an empirical study. , 1989 .
[29] Alexander H. Waibel,et al. Modular Construction of Time-Delay Neural Networks for Speech Recognition , 1989, Neural Computation.
[30] Paul J. Werbos,et al. Backpropagation Through Time: What It Does and How to Do It , 1990, Proc. IEEE.
[31] Jing Peng,et al. An Efficient Gradient-Based Algorithm for On-Line Training of Recurrent Network Trajectories , 1990, Neural Computation.
[32] Sebastian Thrun,et al. Planning with an Adaptive World Model , 1990, NIPS.
[33] Alexander H. Waibel,et al. The Tempo 2 Algorithm: Adjusting Time-Delays By Supervised Learning , 1990, NIPS.
[34] Alexander Linden,et al. Inversion of neural networks by gradient descent , 1990, Parallel Comput..
[35] Jeffrey L. Elman,et al. Finding Structure in Time , 1990, Cogn. Sci..
[36] Jürgen Schmidhuber,et al. Networks adjusting networks , 1990, Forschungsberichte, TU Munich.
[37] Jürgen Schmidhuber. Making the World Differentiable: On Using Self-Supervised Fully Recurrent Neural Networks for Dynamic Reinforcement Learning , 1990 .
[38] Scott E. Fahlman,et al. The Recurrent Cascade-Correlation Architecture , 1990, NIPS.
[40] Dana H. Ballard,et al. Active Perception and Reinforcement Learning , 1990, Neural Computation.
[41] Richard S. Sutton,et al. Integrated Architectures for Learning, Planning, and Reacting Based on Approximating Dynamic Programming , 1990, ML.
[42] Gerald Fahner,et al. A higher order unit that performs arbitrary Boolean functions , 1990, 1990 IJCNN International Joint Conference on Neural Networks.
[43] Marcus Frean,et al. The Upstart Algorithm: A Method for Constructing and Training Feedforward Neural Networks , 1990, Neural Computation.
[44] Stewart W. Wilson. The animat path to AI , 1991 .
[45] Jürgen Schmidhuber,et al. Adaptive confidence and adaptive curiosity , 1991, Forschungsberichte, TU Munich.
[46] Jürgen Schmidhuber,et al. A possibility for implementing curiosity and boredom in model-building neural controllers , 1991 .
[47] Jürgen Schmidhuber,et al. Learning Unambiguous Reduced Sequence Descriptions , 1991, NIPS.
[48] Lambert E. Wixson,et al. Scaling Reinforcement Learning Techniques via Modularity , 1991, ML.
[49] Lawrence Birnbaum,et al. Machine learning : proceedings of the Eighth International Workshop (ML91) , 1991 .
[50] H. L. Roitblat,et al. Cognitive action theory as a control architecture , 1991 .
[51] Terence D. Sanger,et al. A tree-structured adaptive network for function approximation in high-dimensional spaces , 1991, IEEE Trans. Neural Networks.
[52] C. Lee Giles,et al. Extracting and Learning an Unknown Grammar with Recurrent Neural Networks , 1991, NIPS.
[53] Sebastian Thrun,et al. Active Exploration in Dynamic Environments , 1991, NIPS.
[54] Richard S. Sutton,et al. Reinforcement learning architectures for animats , 1991 .
[55] Richard S. Sutton,et al. Iterative Construction of Sparse Polynomial Approximations , 1991, NIPS.
[56] Guo-Zheng Sun,et al. Green's Function Method for Fast On-Line Learning Algorithm of Recurrent Neural Networks , 1991, NIPS.
[57] Gary L. Drescher,et al. Made-up minds - a constructivist approach to artificial intelligence , 1991 .
[58] C. Jutten,et al. GAL: Networks That Grow When They Learn and Shrink When They Forget , 1991 .
[59] Benjamin Kuipers,et al. Learning hill-climbing functions as a strategy for generating behaviors in a mobile robot , 1991 .
[60] Kurt Hornik,et al. Approximation capabilities of multilayer feedforward networks , 1991, Neural Networks.
[61] Michael C. Mozer,et al. Induction of Multiscale Temporal Structure , 1991, NIPS.
[62] Alexander Linden,et al. On Discontinuous Q-Functions in Reinforcement Learning , 1992, GWAI.
[63] Geoffrey E. Hinton,et al. Feudal Reinforcement Learning , 1992, NIPS.
[64] Long-Ji Lin,et al. Reinforcement learning for robots using neural networks , 1992 .
[65] Michael I. Jordan,et al. Forward Models: Supervised Learning with a Distal Teacher , 1992, Cogn. Sci..
[66] Jürgen Schmidhuber,et al. A Fixed Size Storage O(n³) Time Complexity Learning Algorithm for Fully Recurrent Continually Running Networks , 1992, Neural Computation.
[67] Lonnie Chrisman,et al. Reinforcement Learning with Perceptual Aliasing: The Perceptual Distinctions Approach , 1992, AAAI.
[68] Raymond L. Watrous,et al. Induction of Finite-State Languages Using Second-Order Recurrent Networks , 1992, Neural Computation.
[69] Mark B. Ring. Learning Sequential Tasks by Incrementally Adding Higher Orders , 1992, NIPS.
[70] Michael C. Mozer,et al. A Connectionist Symbol Manipulator that Discovers the Structure of Context-Free Languages , 1992, NIPS.
[71] C. Atkeson,et al. Prioritized Sweeping: Reinforcement Learning with Less Data and Less Real Time , 1993 .
[72] Kenji Doya,et al. Universality of Fully-Connected Recurrent Neural Networks , 1993 .
[73] Michael L. Littman,et al. An optimization-based categorization of reinforcement learning environments , 1993 .
[74] Michael R. Davenport,et al. Continuous-time temporal back-propagation with adaptable time delays , 1993, IEEE Trans. Neural Networks.
[75] Mark Ring. Sequence Learning with Incremental Higher-Order Neural Networks , 1993 .
[76] Roderic A. Grupen,et al. Robust Reinforcement Learning in Motion Planning , 1993, NIPS.
[77] Leslie Pack Kaelbling,et al. Learning to Achieve Goals , 1993, IJCAI.
[78] Jürgen Schmidhuber,et al. Planning simple trajectories using neural subgoal generators , 1993 .
[79] Andrew McCallum,et al. Overcoming Incomplete Perception with Utile Distinction Memory , 1993, ICML.
[80] Frank Weber,et al. Implementing inner drive through competence reflection , 1993 .
[81] Mark Ring. Two methods for hierarchy learning in reinforcement environments , 1993 .
[82] Jing Peng,et al. Efficient Learning and Planning Within the Dyna Framework , 1993, Adapt. Behav..
[83] J. Peng,et al. Efficient Learning and Planning Within the Dyna Framework , 1993, IEEE International Conference on Neural Networks.
[84] Leslie Pack Kaelbling,et al. Hierarchical Learning in Stochastic Domains: Preliminary Results , 1993, ICML.
[85] C. L. Giles,et al. Constructive learning of recurrent neural networks , 1993, IEEE International Conference on Neural Networks.
[86] Rolf Eckmiller,et al. Structural adaptation of parsimonious higher-order neural classifiers , 1994, Neural Networks.
[87] Benjamin Kuipers,et al. Learning to Explore and Build Maps , 1994, AAAI.
[88] M. Goudreau,et al. First-order vs. Second-order Single Layer Recurrent Neural Networks , 1994 .
[89] Richard S. Sutton,et al. A Menu of Designs for Reinforcement Learning Over Time , 1995 .
[90] Ben J. A. Kröse,et al. Learning from delayed rewards , 1995, Robotics Auton. Syst..
[91] Andrew G. Barto,et al. Learning to Act Using Real-Time Dynamic Programming , 1995, Artif. Intell..
[92] Thomas G. Dietterich. What is machine learning? , 2020, Archives of Disease in Childhood.
[93] Michael I. Jordan. Serial Order: A Parallel Distributed Processing Approach , 1997 .