An Idea of Using Reinforcement Learning in Adaptive Control Systems
Leszek Koszalka | Iwona Pozniak-Koszalka | Radosław Rudek
[1] A. Barto, et al. Learning and Sequential Decision Making, 1989.
[2] Richard S. Sutton, et al. Dyna, an integrated architecture for learning, planning, and reacting, 1990, SIGART Bull.
[3] L. Koszalka. A concept of adaptive control system for experimentation and controlling described by relation systems, 1994.
[4] Dana H. Ballard, et al. Learning to perceive and act by trial and error, 1991, Machine Learning.
[5] Thomas G. Dietterich. What is machine learning?, 2020, Archives of Disease in Childhood.
[6] Mitsuo Sato, et al. Learning control of finite Markov chains with an explicit trade-off between estimation and control, 1988, IEEE Trans. Syst. Man Cybern.
[7] R. Sutton, et al. Connectionist Learning for Control: An Overview, 1989.
[8] Yishay Mansour, et al. Learning Rates for Q-learning, 2004, J. Mach. Learn. Res.
[9] P. Anandan, et al. Pattern-recognizing stochastic learning automata, 1985, IEEE Transactions on Systems, Man, and Cybernetics.
[10] Peter Auer, et al. Using Confidence Bounds for Exploitation-Exploration Trade-offs, 2003, J. Mach. Learn. Res.
[11] Alon Orlitsky, et al. On Nearest-Neighbor Error-Correcting Output Codes with Application to All-Pairs Multiclass Support Vector Machines, 2003, J. Mach. Learn. Res.
[12] Shie Mannor, et al. A Geometric Approach to Multi-Criterion Reinforcement Learning, 2004, J. Mach. Learn. Res.
[13] Andrew W. Moore, et al. Reinforcement Learning: A Survey, 1996, J. Artif. Intell. Res.
[14] Sridhar Mahadevan, et al. Automatic Programming of Behavior-Based Robots Using Reinforcement Learning, 1991, Artif. Intell.
[15] Kumpati S. Narendra, et al. Learning automata - an introduction, 1989.
[16] Long-Ji Lin, et al. Self-improving reactive agents: case studies of reinforcement learning frameworks, 1991.
[17] Richard S. Sutton, et al. Temporal credit assignment in reinforcement learning, 1984.
[18] Richard O. Duda, et al. Pattern classification and scene analysis, 1974, A Wiley-Interscience publication.
[19] Peter Dayan, et al. Q-learning, 1992, Machine Learning.
[20] Jerry M. Mendel, et al. Reinforcement-learning control and pattern recognition systems, 1994.
[21] James A. Hendler, et al. Planning in Uncertain, Unpredictable or Changing Environments, 1990.
[22] Gene F. Franklin, et al. Feedback Control of Dynamic Systems, 1986.
[23] Richard S. Sutton, et al. Associative search network: A reinforcement learning associative memory, 1981, Biological Cybernetics.
[24] Paul J. Werbos, et al. Neural networks for control and system identification, 1989, Proceedings of the 28th IEEE Conference on Decision and Control.
[25] P. Kumar, et al. Optimal adaptive controllers for unknown Markov chains, 1982.
[26] Richard S. Sutton, et al. Time-Derivative Models of Pavlovian Reinforcement, 1990.
[27] Arthur L. Samuel, et al. Some Studies in Machine Learning Using the Game of Checkers, 1967, IBM J. Res. Dev.
[28] Paul J. Werbos, et al. Building and Understanding Adaptive Systems: A Statistical/Numerical Approach to Factory Automation and Brain Research, 1987, IEEE Transactions on Systems, Man, and Cybernetics.
[29] Katsuhiko Ogata, et al. Modern Control Engineering, 1970.
[30] Richard S. Sutton, et al. Reinforcement Learning, 1992, Handbook of Machine Learning.
[31] Ian H. Witten, et al. An Adaptive Optimal Controller for Discrete-Time Markov Environments, 1977, Inf. Control.
[32] Peter E. Hart, et al. Pattern classification and scene analysis, 1974, A Wiley-Interscience publication.
[33] Virgil W. Eveleigh, et al. Adaptive Control And Optimization Techniques, 1967.
[34] Andrew G. Barto, et al. On the Computational Economics of Reinforcement Learning, 1991.
[35] P. Mandl, et al. Estimation and control in Markov chains, 1974, Advances in Applied Probability.
[36] V. Borkar, et al. Adaptive control of Markov chains, I: Finite parameter set, 1979, 18th IEEE Conference on Decision and Control including the Symposium on Adaptive Processes.
[37] A. Jalali, et al. Computationally efficient adaptive control algorithms for Markov chains, 1989, Proceedings of the 28th IEEE Conference on Decision and Control.
[38] Jean-Arcady Meyer, et al. Self-improving Reactive Agents: Case Studies of Reinforcement Learning Frameworks, 1991.
[39] P. W. Jones, et al. Bandit Problems, Sequential Allocation of Experiments, 1987.
[40] L. Baird, et al. A Mathematical Analysis of Actor-Critic Architectures for Learning Optimal Controls Through Incremental Dynamic Programming, 1990.
[41] Andrew G. Barto, et al. Connectionist learning for control: an overview, 1990.
[42] Richard S. Sutton, et al. Training and Tracking in Robotics, 1985, IJCAI.
[43] Douglas C. Hittle, et al. Synthesis of reinforcement learning, neural networks and PI control applied to a simulated heating coil, 1997, Artificial Intelligence in Engineering.
[44] Andrew G. Barto, et al. Reinforcement learning, 1998.
[45] C. W. Anderson, et al. Learning to control an inverted pendulum using neural networks, 1989, IEEE Control Systems Magazine.
[46] John J. Grefenstette, et al. Learning Sequential Decision Rules Using Simulation Models and Competition, 1990, Machine Learning.
[47] Richard E. Korf, et al. Real-Time Heuristic Search, 1990, Artif. Intell.
[48] Ben J. A. Kröse, et al. Learning from delayed rewards, 1995, Robotics Auton. Syst.
[49] Andrew G. Barto, et al. Reinforcement Learning and Dynamic Programming, 1995.
[50] Richard S. Sutton, et al. Learning to predict by the methods of temporal differences, 1988, Machine Learning.
[51] Richard Wheeler, et al. Decentralized learning in finite Markov chains, 1985, 24th IEEE Conference on Decision and Control.