A Bibliography of Work Related to Reinforcement Learning

[1] Andrew W. Moore et al. Variable Resolution Reinforcement Learning, 1995.

[2] Richard S. Sutton et al. The Truck Backer-Upper: An Example of Self-Learning in Neural Networks, 1995.

[3] Richard S. Sutton et al. A Menu of Designs for Reinforcement Learning Over Time, 1995.

[4] Andrew G. Barto et al. Learning to Act Using Real-Time Dynamic Programming, 1995, Artif. Intell.

[5] Marco Colombetti et al. Robot Shaping: Developing Autonomous Agents Through Learning, 1994, Artif. Intell.

[6] Maja J. Mataric et al. Reward Functions for Accelerated Learning, 1994, ICML.

[7] Marco Dorigo et al. A comparison of Q-learning and classifier systems, 1994.

[8] W. Estes. Toward a Statistical Theory of Learning, 1994.

[9] Gerald Tesauro et al. TD-Gammon, a Self-Teaching Backgammon Program, Achieves Master-Level Play, 1994, Neural Computation.

[10] Volker Tresp et al. A Trivial but Fast Reinforcement Controller, 1994.

[11] Roderic A. Grupen et al. Robust Reinforcement Learning in Motion Planning, 1993, NIPS.

[12] Mark Ring. Two methods for hierarchy learning in reinforcement environments, 1993.

[13] Jürgen Schmidhuber et al. Planning simple trajectories using neural subgoal generators, 1993.

[14] Anton Schwartz et al. A Reinforcement Learning Method for Maximizing Undiscounted Rewards, 1993, ICML.

[15] Richard S. Sutton et al. Online Learning with Random Representations, 1993, ICML.

[16] Andrew McCallum et al. Overcoming Incomplete Perception with Utile Distinction Memory, 1993, ICML.

[17] Leslie Pack Kaelbling et al. Learning in embedded systems, 1993.

[18] Jing Peng et al. Efficient Learning and Planning Within the Dyna Framework, 1993, Adapt. Behav.

[19] Sebastian Thrun et al. Exploration and model building in mobile robot domains, 1993, IEEE International Conference on Neural Networks.

[20] Gary McGraw et al. Emergent Control and Planning in an Autonomous Vehicle, 1993.

[21] Ronald J. Williams et al. Analysis of Some Incremental Variants of Policy Iteration: First Steps Toward Understanding Actor-Critic Learning Systems, 1993.

[22] Ronald J. Williams et al. Tight Performance Bounds on Greedy Policies Based on Imperfect Value Functions, 1993.

[23] Eduardo D. Sontag et al. Neural Networks for Control, 1993.

[24] Eduardo D. Sontag et al. Some Topics in Neural Networks and Control, 1993.

[25] Andrew W. Moore et al. Memory-Based Reinforcement Learning: Efficient Computation with Prioritized Sweeping, 1992, NIPS.

[26] Steven J. Bradtke et al. Reinforcement Learning Applied to Linear Quadratic Regulation, 1992, NIPS.

[27] Sebastian Thrun et al. Explanation-Based Neural Network Learning for Robot Control, 1992, NIPS.

[28] Satinder P. Singh et al. Reinforcement Learning with a Hierarchy of Abstract Models, 1992, AAAI.

[29] Satinder P. Singh et al. Scaling Reinforcement Learning Algorithms by Learning Variable Temporal Resolution Models, 1992, ML.

[30] Judy A. Franklin et al. Learning channel allocation strategies in real time, 1992, [1992 Proceedings] Vehicular Technology Society 42nd VTS Conference - Frontiers of Technology.

[31] D. Sofge. The Role of Exploration in Learning Control, 1992.

[32] S. Thrun. Efficient Exploration in Reinforcement Learning, 1992.

[33] Andrew W. Moore et al. Fast, Robust Adaptive Control by Learning only Forward Models, 1991, NIPS.

[34] Sebastian Thrun et al. Active Exploration in Dynamic Environments, 1991, NIPS.

[35] Jürgen Schmidhuber et al. Curious model-building control systems, 1991, [Proceedings] 1991 IEEE International Joint Conference on Neural Networks.

[36] Steven D. Whitehead et al. A Complexity Analysis of Cooperative Mechanisms in Reinforcement Learning, 1991, AAAI.

[37] Ming Tan et al. Cost-Sensitive Reinforcement Learning for Adaptive Classification and Control, 1991, AAAI.

[38] Lambert E. Wixson et al. Scaling Reinforcement Learning Techniques via Modularity, 1991, ML.

[39] Steven D. Whitehead et al. Complexity and Cooperation in Q-Learning, 1991, ML.

[40] Richard S. Sutton et al. Reinforcement learning architectures for animats, 1991.

[41] Jürgen Schmidhuber et al. A possibility for implementing curiosity and boredom in model-building neural controllers, 1991.

[42] Hans J. Bremermann et al. How the Brain Adjusts Synapses - Maybe, 1991, Automated Reasoning: Essays in Honor of Woody Bledsoe.

[43] Jürgen Schmidhuber et al. Learning to Generate Artificial Fovea Trajectories for Target Detection, 1991, Int. J. Neural Syst.

[44] Richard S. Sutton et al. Planning by Incremental Dynamic Programming, 1991, ML.

[45] Dana H. Ballard et al. Active Perception and Reinforcement Learning, 1990, Neural Computation.

[46] Jacques J. Vidal et al. Adaptive Range Coding, 1990, NIPS.

[47] Jürgen Schmidhuber et al. Reinforcement Learning in Markovian and Non-Markovian Environments, 1990, NIPS.

[48] Richard S. Sutton et al. Advances in reinforcement learning and their implications for intelligent control, 1990, Proceedings of the 5th IEEE International Symposium on Intelligent Control.

[49] Andrew W. Moore et al. Acquisition of Dynamic Control Knowledge for a Robotic Manipulator, 1990, ML.

[50] Richard S. Sutton et al. Integrated Architectures for Learning, Planning, and Reacting Based on Approximating Dynamic Programming, 1990, ML.

[51] Ronald J. Williams et al. Adaptive state representation and estimation using recurrent connectionist networks, 1990.

[52] Richard S. Sutton et al. Time-Derivative Models of Pavlovian Reinforcement, 1990.

[53] Jürgen Schmidhuber et al. Recurrent networks adjusted by adaptive critics, 1990.

[54] Andrew G. Barto et al. Connectionist learning for control, 1990.

[55] Judy A. Franklin et al. Historical perspective and state of the art in connectionist learning control, 1989, Proceedings of the 28th IEEE Conference on Decision and Control.

[56] D. Ballard et al. A Role for Anticipation in Reactive Systems that Learn, 1989, ML.

[57] W. Thomas Miller et al. Real-time application of neural networks for sensor-based control of robots with vision, 1989, IEEE Trans. Syst. Man Cybern.

[58] Kumpati S. Narendra et al. Learning automata - an introduction, 1989.

[59] C. W. Anderson et al. Learning to control an inverted pendulum using neural networks, 1989, IEEE Control Systems Magazine.

[60] John N. Tsitsiklis et al. Parallel and distributed computation, 1989.

[61] David H. Ackley et al. Generalization and Scaling in Reinforcement Learning, 1989, NIPS.

[62] Wei-Min Shen et al. Learning from the environment based on percepts and actions, 1989.

[63] Jürgen Schmidhuber et al. A Local Learning Algorithm for Dynamic Feedforward and Recurrent Networks, 1989.

[64] R. J. Williams et al. On the use of backpropagation in associative reinforcement learning, 1988, IEEE 1988 International Conference on Neural Networks.

[65] Judy A. Franklin. Compliance and learning: control skills for a robot operating in an uncertain world, 1988.

[66] Paul J. Werbos et al. Generalization of backpropagation with application to a recurrent gas market model, 1988, Neural Networks.

[67] P. W. Jones et al. Bandit Problems, Sequential Allocation of Experiments, 1987.

[68] John N. Tsitsiklis et al. The Complexity of Markov Decision Processes, 1987, Math. Oper. Res.

[69] Filson H. Glanz et al. Application of a General Learning Algorithm to the Control of Robotic Manipulators, 1987.

[70] W. Thomas Miller et al. Sensor-based control of robotic manipulators using a general learning algorithm, 1987, IEEE J. Robotics Autom.

[71] Dimitri P. Bertsekas et al. Dynamic Programming: Deterministic and Stochastic Models, 1987.

[72] Charles W. Anderson et al. Strategy Learning with Multilayer Connectionist Representations, 1987.

[73] S. Thomas Alexander et al. Adaptive Signal Processing, 1986, Texts and Monographs in Computer Science.

[74] Richard S. Sutton et al. Training and Tracking in Robotics, 1985, IJCAI.

[75] P. Anandan et al. Pattern-recognizing stochastic learning automata, 1985, IEEE Transactions on Systems, Man, and Cybernetics.

[76] A. G. Barto et al. Learning by statistical cooperation of self-interested neuron-like computing elements, 1985, Human Neurobiology.

[77] M. A. L. Thathachar et al. A new approach to the design of reinforcement schemes for learning automata, 1985, IEEE Transactions on Systems, Man, and Cybernetics.

[78] Richard S. Sutton et al. Temporal credit assignment in reinforcement learning, 1984.

[79] Hendrik Van Brussel et al. A self-learning automaton with variable resolution for high precision assembly by industrial robots, 1982.

[80] A. G. Barto et al. Toward a modern theory of adaptive networks: expectation and prediction, 1981, Psychological Review.

[81] Edward J. Sondik et al. The Optimal Control of Partially Observable Markov Processes over the Infinite Horizon: Discounted Costs, 1978, Oper. Res.

[82] Ian H. Witten et al. An Adaptive Optimal Controller for Discrete-Time Markov Environments, 1977, Inf. Control.

[83] Kumpati S. Narendra et al. Learning Automata - A Survey, 1974, IEEE Trans. Syst. Man Cybern.

[84] Bernard Widrow et al. Punish/Reward: Learning with a Critic in Adaptive Threshold Systems, 1973, IEEE Trans. Syst. Man Cybern.

[85] M. L. Tsetlin et al. Automaton theory and modeling of biological systems, 1973.

[86] R. Rescorla et al. A theory of Pavlovian conditioning: Variations in the effectiveness of reinforcement and nonreinforcement, 1972.

[87] A. S. Harding. Markovian decision processes, 1970.

[88] Wilm E. Donath et al. Hardware implementation, 1968, AFIPS '68 (Fall, part II).

[89] J. Laurie Snell et al. Studies in mathematical learning theory, 1960.

[90] R. Howard. Dynamic Programming and Markov Processes, 1960.

[91] Arthur L. Samuel et al. Some Studies in Machine Learning Using the Game of Checkers, 1967, IBM J. Res. Dev.