Reinforcement learning and its application to control
[1] B. Skinner. Superstition in the pigeon , 1948, Journal of experimental psychology.
[2] Y. T. Li,et al. Principles of optimalizing control systems and an application to the internal combustion engine , 1951 .
[3] J. Wolfowitz. On the Stochastic Approximation Method of Robbins and Monro , 1952 .
[4] J. Kiefer,et al. Stochastic Estimation of the Maximum of a Regression Function , 1952 .
[5] J. Doob. Stochastic processes , 1953 .
[6] W. A. Clark,et al. Simulation of self-organizing systems by digital computer , 1954, Trans. IRE Prof. Group Inf. Theory.
[7] A. Dvoretzky. On Stochastic Approximation , 1956 .
[8] Arthur L. Samuel,et al. Some Studies in Machine Learning Using the Game of Checkers , 1959, IBM J. Res. Dev..
[9] M. Stone,et al. Studies in mathematical learning theory , 1960 .
[10] Marvin Minsky,et al. Steps toward Artificial Intelligence , 1961, Proceedings of the IRE.
[11] B. Widrow,et al. Generalization and information storage in networks of adaline 'neurons' , 1962 .
[12] J. Orbach. Principles of Neurodynamics. Perceptrons and the Theory of Brain Mechanisms , 1962 .
[13] K. Fu,et al. A heuristic approach to reinforcement learning control systems , 1965 .
[14] B. Chandrasekaran,et al. On expediency and convergence in variable structure automata , 1966 .
[15] Harley Bornbach,et al. An introduction to mathematical learning theory , 1967 .
[16] A. L. Samuel,et al. Some studies in machine learning using the game of checkers. II: recent progress , 1967 .
[17] A. Klopf,et al. An Evolutionary Pattern Recognition Network , 1969 .
[18] A. H. Klopf,et al. Brain Function and Adaptive Systems: A Heterostatic Theory , 1972 .
[19] Richard Fikes,et al. Learning and Executing Generalized Robot Plans , 1972, Artif. Intell..
[20] Allen Newell,et al. Human Problem Solving , 1973 .
[21] M. L. Tsetlin,et al. Automaton theory and modeling of biological systems , 1973 .
[22] P. Werbos,et al. Beyond Regression : New Tools for Prediction and Analysis in the Behavioral Sciences , 1974 .
[23] E Harth,et al. Alopex: a stochastic method for determining visual receptive fields , 1974, Vision research.
[24] Peter E. Hart,et al. Pattern classification and scene analysis , 1974, A Wiley-Interscience publication.
[25] George N. Saridis,et al. Self-organizing control of stochastic systems , 1977 .
[26] W. K. Honig,et al. Handbook of Operant Behavior , 1977 .
[27] Teuvo Kohonen,et al. Associative memory. A system-theoretical approach , 1977 .
[28] J. Albus. Mechanisms of planning and problem solving in the brain , 1979 .
[29] James S. Albus,et al. Brains, behavior, and robotics , 1981 .
[30] Daniel E. Whitney,et al. Quasi-Static Assembly of Compliantly Supported Rigid Parts , 1982 .
[31] Hendrik Van Brussel,et al. A self-learning automaton with variable resolution for high precision assembly by industrial robots , 1982 .
[32] R. Sutton,et al. Simulation of anticipatory responses in classical conditioning by a neuron-like adaptive element , 1982, Behavioural Brain Research.
[33] Milan E. Soklic. Adaptive model for decision making , 1982, Pattern Recognit..
[34] J. Staddon. Adaptive behavior and learning , 1983 .
[35] Steven Edward Hampson,et al. A neural model of adaptive behavior , 1983 .
[36] D. Levine. The hedonistic neuron, a theory of memory, learning, and intelligence: A. Harry Klopf Hemisphere Press, Washington, New York, and London, 1982, 140 pp., $19.95 (paperback) , 1983 .
[37] Richard S. Sutton,et al. Temporal credit assignment in reinforcement learning , 1984 .
[38] Suguru Arimoto,et al. Bettering operation of Robots by learning , 1984, J. Field Robotics.
[39] Graham C. Goodwin,et al. Adaptive filtering prediction and control , 1984 .
[40] P. Anandan,et al. Pattern-recognizing stochastic learning automata , 1985, IEEE Transactions on Systems, Man, and Cybernetics.
[41] Richard S. Sutton,et al. Training and Tracking in Robotics , 1985, IJCAI.
[42] Richard E. Korf,et al. Macro-Operators: A Weak Method for Learning , 1985, Artif. Intell..
[43] A G Barto,et al. Learning by statistical cooperation of self-interested neuron-like computing elements , 1985, Human neurobiology.
[44] Geoffrey E. Hinton,et al. A Learning Algorithm for Boltzmann Machines , 1985, Cogn. Sci..
[45] David Zipser,et al. Feature Discovery by Competitive Learning , 1986, Cogn. Sci..
[46] R. E. Gustavson,et al. A Theory for the Three-Dimensional Mating of Chamfered Cylindrical Parts , 1985 .
[47] Geoffrey E. Hinton,et al. Learning internal representations by error propagation , 1986 .
[48] Charles W. Anderson,et al. Learning and problem-solving with multilayer connectionist systems (adaptive, strategy learning, neural networks, reinforcement learning) , 1986 .
[49] M. Minsky. The Society of Mind , 1986 .
[50] John H. Holland,et al. Escaping brittleness: the possibilities of general-purpose learning algorithms applied to parallel rule-based systems , 1986 .
[51] Michael A. Erdmann,et al. Using Backprojections for Fine Motion Planning with Uncertainty , 1986 .
[52] James L. McClelland,et al. Parallel Distributed Processing: Explorations in the Microstructure of Cognition : Psychological and Biological Models , 1986 .
[53] Bruce Randall Donald,et al. Robot motion planning with uncertainty in the geometric models of the robot and environment: A formal framework for error detection and recovery , 1986, Proceedings. 1986 IEEE International Conference on Robotics and Automation.
[54] King-Sun Fu,et al. Learning Control Systems-Review and Outlook , 1986, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[55] Paul E. Utgoff,et al. Learning to control a dynamic physical system , 1987, Comput. Intell..
[56] Michael Kuperstein,et al. Adaptive visual-motor coordination in multijoint robots using parallel architecture , 1987, Proceedings. 1987 IEEE International Conference on Robotics and Automation.
[57] Steven Jeffrey Gordon. Automated assembly using feature localization , 1987 .
[58] W. Thomas Miller,et al. Sensor-based control of robotic manipulators using a general learning algorithm , 1987, IEEE J. Robotics Autom..
[59] R. Lippmann,et al. An introduction to computing with neural nets , 1987, IEEE ASSP Magazine.
[60] Allen Newell,et al. SOAR: An Architecture for General Intelligence , 1987, Artif. Intell..
[61] Robert B. Allen,et al. Stochastic Learning Networks and their Electronic Implementation , 1987, NIPS.
[62] Steven J. Nowlan,et al. Gain Variation in Recurrent Error Propagation Networks , 1988, Complex Syst..
[63] Yoshiro Miyata,et al. The learning and planning of actions , 1988 .
[64] S. Lee,et al. Learning expert systems for robot fine motion control , 1988, Proceedings IEEE International Symposium on Intelligent Control 1988.
[65] Bernard Widrow,et al. Adaptive switching circuits , 1988 .
[66] V. Gullapalli. A Stochastic Algorithm for Learning Real-valued Functions via Reinforcement , 1988 .
[67] Russell Leighton,et al. Shaping schedules as a method for accelerated learning , 1988, Neural Networks.
[68] James L. McClelland,et al. Explorations in parallel distributed processing: a handbook of models, programs, and exercises , 1988 .
[69] O. G. Selfridge,et al. Pandemonium: a paradigm for learning , 1988 .
[70] M. Kawato,et al. Hierarchical neural network model for voluntary movement with application to robotics , 1988, IEEE Control Systems Magazine.
[71] A. Meystel,et al. Intelligent control in robotics , 1988 .
[72] Judy A. Franklin. Compliance and learning: control skills for a robot operating in an uncertain world , 1988 .
[73] Michael C. Mozer,et al. Skeletonization: A Technique for Trimming the Fat from a Network via Relevance Assessment , 1988, NIPS.
[74] Enis Ersü,et al. Learning Control Structures with Neuron-Like Associative Memory Systems , 1988 .
[75] Michael I. Jordan. Supervised learning and systems with excess degrees of freedom , 1988 .
[76] R. J. Williams,et al. On the use of backpropagation in associative reinforcement learning , 1988, IEEE 1988 International Conference on Neural Networks.
[77] Lorien Y. Pratt,et al. Comparing Biases for Minimal Network Construction with Back-Propagation , 1988, NIPS.
[78] Kumpati S. Narendra,et al. Learning automata - an introduction , 1989 .
[79] C. Watkins. Learning from delayed rewards , 1989 .
[80] Anuradha M. Annaswamy,et al. Stable Adaptive Systems , 1989 .
[81] C.W. Anderson,et al. Learning to control an inverted pendulum using neural networks , 1989, IEEE Control Systems Magazine.
[82] P. J. Werbos,et al. Backpropagation and neurocontrol: a review and prospectus , 1989, International 1989 Joint Conference on Neural Networks.
[83] Richard S. Sutton,et al. Learning and Sequential Decision Making , 1989 .
[84] Christian Lebiere,et al. The Cascade-Correlation Learning Architecture , 1989, NIPS.
[85] Michael I. Jordan,et al. Learning to Control an Unstable System with Forward Modeling , 1989, NIPS.
[86] Geoffrey E. Hinton. Connectionist Learning Procedures , 1989, Artif. Intell..
[87] Robert B. Allen,et al. Adaptive training for connectionist state machines , 1989, CSC '89.
[88] David J. Reinkensmeyer,et al. Using associative content-addressable memories to control robots , 1989, Proceedings, 1989 International Conference on Robotics and Automation.
[89] Warren P. Seering,et al. Assembly strategies for chamferless parts , 1989, Proceedings, 1989 International Conference on Robotics and Automation.
[90] Yann LeCun,et al. Optimal Brain Damage , 1989, NIPS.
[91] Geoffrey E. Hinton,et al. Evaluation of Adaptive Mixtures of Competing Experts , 1990, NIPS.
[92] S. Chipman. Foundations of Cognitive Science , 1990, Journal of Cognitive Neuroscience.
[93] Vijaykumar Gullapalli,et al. A stochastic reinforcement learning algorithm for learning real-valued functions , 1990, Neural Networks.
[94] Derrick H. Nguyen,et al. Truck backer-upper: an example of self-learning in neural networks , 1990, Defense, Security, and Sensing.
[95] Stephen José Hanson,et al. A stochastic version of the delta rule , 1990 .
[96] Kumpati S. Narendra,et al. Adaptive control using neural networks , 1990 .
[97] Mahesan Niranjan,et al. Neural networks and radial basis functions in classifying static speech patterns , 1990 .
[98] B. Ydstie. Forecasting and control using adaptive connectionist networks , 1990 .
[99] Andrew G. Barto,et al. Connectionist learning for control: an overview , 1990 .
[100] H. Harry Asada,et al. Teaching and learning of compliance using neural nets: representation and generation of nonlinear compliance , 1990, Proceedings., IEEE International Conference on Robotics and Automation.
[101] Marwan A. Jabri,et al. Weight Perturbation: An Optimal Architecture and Learning Technique for Analog VLSI Feedforward and Recurrent Multilayer Networks , 1991, Neural Computation.
[102] Alexis P. Wieland,et al. Evolving Controls for Unstable Systems , 1991 .
[103] Andrew G. Barto,et al. On the Computational Economics of Reinforcement Learning , 1991 .
[104] Stephen H. Lane,et al. Goal-directed encoding of task knowledge for robotic skill acquisition , 1991, Proceedings of the 1991 IEEE International Symposium on Intelligent Control.
[105] V. Gullapalli,et al. A comparison of supervised and reinforcement learning methods on a reinforcement learning task , 1991, Proceedings of the 1991 IEEE International Symposium on Intelligent Control.
[106] V. Gullapalli. Modeling cortical area 7a using Stochastic Real-Valued (SRV) units , 1991 .
[107] Michael I. Jordan,et al. Task Decomposition Through Competition in a Modular Connectionist Architecture: The What and Where Vision Tasks , 1991, Cogn. Sci..
[108] Hartmut Logemann,et al. Multivariable feedback design : J. M. Maciejowski , 1991, Autom..
[109] Michael I. Jordan,et al. Forward Models: Supervised Learning with a Distal Teacher , 1992, Cogn. Sci..
[110] G. Kane. Parallel Distributed Processing: Explorations in the Microstructure of Cognition, vol 1: Foundations, vol 2: Psychological and Biological Models , 1994 .
[111] Jerry M. Mendel,et al. Reinforcement-learning control and pattern recognition systems , 1994 .