Optimal and Autonomous Control Using Reinforcement Learning: A Survey
Frank L. Lewis | Kyriakos G. Vamvoudakis | Hamidreza Modares | Bahare Kiumarsi
[1] Anup Parikh,et al. Adaptive control of a surface marine craft with parameter identification using integral concurrent learning , 2016, 2016 IEEE 55th Conference on Decision and Control (CDC).
[2] Peter Dayan,et al. Q-learning , 1992, Machine Learning.
[3] Marcus Johnson,et al. Approximate N-Player Nonzero-Sum Game Solution for an Uncertain Continuous Nonlinear System , 2015, IEEE Transactions on Neural Networks and Learning Systems.
[4] Robert Tibshirani,et al. The Elements of Statistical Learning: Data Mining, Inference, and Prediction, 2nd Edition , 2001, Springer Series in Statistics.
[5] S. Sastry,et al. Adaptive Control: Stability, Convergence and Robustness , 1989 .
[6] Warren E. Dixon,et al. Model-based reinforcement learning for approximate optimal regulation , 2016, Autom..
[7] Ales Ude,et al. Programming full-body movements for humanoid robots by observation , 2004, Robotics Auton. Syst..
[8] Paul J. Werbos,et al. Neural networks for control and system identification , 1989, Proceedings of the 28th IEEE Conference on Decision and Control.
[9] Min Guo,et al. Reinforcement Learning Neural Network to the Problem of Autonomous Mobile Robot Obstacle Avoidance , 2005, 2005 International Conference on Machine Learning and Cybernetics.
[10] Michail G. Lagoudakis,et al. Least-Squares Policy Iteration , 2003, J. Mach. Learn. Res..
[11] Nguyen Tan Luy,et al. Reinforcement learning-based optimal tracking control for wheeled mobile robot , 2012, 2012 IEEE International Conference on Cyber Technology in Automation, Control, and Intelligent Systems (CYBER).
[12] Frank L. Lewis,et al. Game-Theoretic Control of Active Loads in DC Microgrids , 2016, IEEE Transactions on Energy Conversion.
[13] Paul J. Werbos,et al. Approximate dynamic programming for real-time control and neural modeling , 1992 .
[14] Zhong-Ping Jiang,et al. Computational adaptive optimal control for continuous-time linear systems with completely unknown dynamics , 2012, Autom..
[15] Frank L. Lewis,et al. Optimal Tracking Control of Uncertain Systems , 2016 .
[16] Jingyuan Zhang,et al. Application of Artificial Neural Network Based on Q-learning for Mobile Robot Path Planning , 2006, 2006 IEEE International Conference on Information Acquisition.
[17] Frank L. Lewis,et al. 2009 Special Issue: Neural network approach to continuous-time direct adaptive optimal control for partially unknown nonlinear systems , 2009 .
[18] F. Lewis,et al. Model-free Q-learning designs for discrete-time zero-sum games with application to H-infinity control , 2007, 2007 European Control Conference (ECC).
[19] Zhicong Huang,et al. Adaptive impedance control of robotic exoskeletons using reinforcement learning , 2016, 2016 International Conference on Advanced Robotics and Mechatronics (ICARM).
[20] Xi-Ren Cao. Stochastic Learning and Optimization , 2007 .
[21] Derong Liu,et al. Adaptive Dynamic Programming for Control , 2012 .
[22] Nguyen Tan Luy,et al. Reinforcement learning-based robust adaptive tracking control for multi-wheeled mobile robots synchronization with optimality , 2013, 2013 IEEE Workshop on Robotic Intelligence in Informationally Structured Space (RiiSS).
[23] Kyriakos G. Vamvoudakis,et al. Online Optimal Operation of Parallel Voltage-Source Inverters Using Partial Information , 2017, IEEE Transactions on Industrial Electronics.
[24] John E. Laird,et al. Learning procedural knowledge through observation , 2001, K-CAP '01.
[25] Frank L. Lewis,et al. Integral reinforcement learning and experience replay for adaptive optimal control of partially-unknown constrained-input continuous-time systems , 2014, Autom..
[26] Marcin Szuster,et al. Discrete Globalised Dual Heuristic Dynamic Programming in Control of the Two-Wheeled Mobile Robot , 2014 .
[27] Yanhong Luo,et al. Approximate optimal control for a class of nonlinear discrete-time systems with saturating actuators , 2008 .
[28] Andrew G. Barto,et al. Adaptive linear quadratic control using policy iteration , 1994, Proceedings of 1994 American Control Conference - ACC '94.
[29] Travis Dierks,et al. Online optimal control of nonlinear discrete-time systems using approximate dynamic programming , 2011 .
[30] Hao Xu,et al. Neural network‐based finite horizon optimal adaptive consensus control of mobile robot formations , 2016 .
[31] Stefan Schaal,et al. Reinforcement Learning With Sequences of Motion Primitives for Robust Manipulation , 2012, IEEE Transactions on Robotics.
[32] Petros A. Ioannou,et al. Adaptive Control Tutorial (Advances in Design and Control) , 2006 .
[33] Haibo He,et al. Event-Triggered Adaptive Dynamic Programming for Continuous-Time Systems With Control Constraints , 2017, IEEE Transactions on Neural Networks and Learning Systems.
[34] Kevin P. Murphy,et al. Machine learning - a probabilistic perspective , 2012, Adaptive computation and machine learning series.
[35] Hari Om Gupta,et al. Application of policy iteration technique based adaptive optimal control design for automatic voltage regulator of power system , 2014 .
[36] Frank L. Lewis,et al. H∞ Tracking Control of Completely Unknown Continuous-Time Systems via Off-Policy Reinforcement Learning , 2015, IEEE Transactions on Neural Networks and Learning Systems.
[37] Sergey Levine,et al. End-to-End Training of Deep Visuomotor Policies , 2015, J. Mach. Learn. Res..
[38] Y. Matsuoka,et al. Reinforcement Learning and Synergistic Control of the ACT Hand , 2013, IEEE/ASME Transactions on Mechatronics.
[39] Frank L. Lewis,et al. Reinforcement Q-learning for optimal tracking control of linear discrete-time systems with unknown dynamics , 2014, Autom..
[40] Girish Chowdhary,et al. Concurrent learning adaptive control of linear systems with exponentially convergent bounds , 2013 .
[41] Sarangapani Jagannathan,et al. Approximate optimal distributed control of uncertain nonlinear interconnected systems with event-sampled feedback , 2016, 2016 IEEE 55th Conference on Decision and Control (CDC).
[42] Robert Kozma,et al. Complete stability analysis of a heuristic approximate dynamic programming control design , 2015, Autom..
[43] Sarangapani Jagannathan,et al. Distributed adaptive optimal regulation of uncertain large-scale interconnected systems using hybrid Q-learning approach , 2016 .
[44] Sergey Levine,et al. Trust Region Policy Optimization , 2015, ICML.
[45] Frank L. Lewis,et al. Linear Quadratic Tracking Control of Partially-Unknown Continuous-Time Systems Using Reinforcement Learning , 2014, IEEE Transactions on Automatic Control.
[46] Mathew Mithra Noel,et al. Nonlinear control of a boost converter using a robust regression based reinforcement learning algorithm , 2016, Eng. Appl. Artif. Intell..
[47] Avimanyu Sahoo,et al. Near Optimal Event-Triggered Control of Nonlinear Discrete-Time Systems Using Neurodynamic Programming , 2016, IEEE Transactions on Neural Networks and Learning Systems.
[48] Frank L. Lewis,et al. A novel actor-critic-identifier architecture for approximate optimal control of uncertain nonlinear systems , 2013, Autom..
[49] Xiaogang Ruan,et al. Application of reinforcement learning based on neural network to dynamic obstacle avoidance , 2008, 2008 International Conference on Information and Automation.
[50] Luigi Fortuna,et al. Reinforcement Learning and Adaptive Dynamic Programming for Feedback Control , 2009 .
[51] Frank L. Lewis,et al. Reinforcement Learning for Partially Observable Dynamic Processes: Adaptive Dynamic Programming Using Measured Output Data , 2011, IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics).
[52] Frank L. Lewis,et al. Optimal tracking control of nonlinear partially-unknown constrained-input systems using integral reinforcement learning , 2014, Autom..
[53] Naira Hovakimyan,et al. L1 Adaptive Control Theory - Guaranteed Robustness with Fast Adaptation , 2010, Advances in design and control.
[54] Ali Heydari,et al. Feedback Solution to Optimal Switching Problems With Switching Cost , 2014, IEEE Transactions on Neural Networks and Learning Systems.
[55] Ruey-Wen Liu,et al. Construction of Suboptimal Control Sequences , 1967 .
[56] R. Bellman. Dynamic Programming and Lagrange Multipliers , 1956, Proceedings of the National Academy of Sciences of the United States of America.
[57] Gang Tao,et al. Adaptive Control Design and Analysis , 2003 .
[58] Michael I. Jordan,et al. Machine learning: Trends, perspectives, and prospects , 2015, Science.
[59] Ali Heydari,et al. Optimal Switching and Control of Nonlinear Switching Systems Using Approximate Dynamic Programming , 2014, IEEE Transactions on Neural Networks and Learning Systems.
[60] Robert F. Stengel,et al. Optimal Control and Estimation , 1994 .
[61] Frank L. Lewis,et al. Adaptive optimal control for continuous-time linear systems based on policy iteration , 2009, Autom..
[62] Ali Heydari,et al. Optimal scheduling for reference tracking or state regulation using reinforcement learning , 2015, J. Frankl. Inst..
[63] Pieter Abbeel,et al. Autonomous Helicopter Aerobatics through Apprenticeship Learning , 2010, Int. J. Robotics Res..
[64] Huai‐Ning Wu,et al. Computationally efficient simultaneous policy update algorithm for nonlinear H∞ state feedback control with Galerkin's method , 2013 .
[65] K. Vamvoudakis. Event-triggered optimal adaptive control algorithm for continuous-time nonlinear systems , 2014, IEEE/CAA Journal of Automatica Sinica.
[66] Martin T. Hagan,et al. Neural network design , 1995 .
[67] A. Billard,et al. Effects of repeated exposure to a humanoid robot on children with autism , 2004 .
[68] Richard S. Sutton,et al. Neuronlike adaptive elements that can solve difficult learning control problems , 1983, IEEE Transactions on Systems, Man, and Cybernetics.
[69] George M. Siouris,et al. Applied Optimal Control: Optimization, Estimation, and Control , 1979, IEEE Transactions on Systems, Man, and Cybernetics.
[70] R. A. Howard. Dynamic Programming and Markov Processes , 1960 .
[71] Paul J. Werbos. A menu of designs for reinforcement learning over time , 1990 .
[72] Ahmad Ghanbari,et al. Neural Network Reinforcement Learning for Walking Control of a 3-Link Biped Robot , 2015 .
[73] Tingwen Huang,et al. Off-Policy Reinforcement Learning for H∞ Control Design , 2013, IEEE Transactions on Cybernetics.
[74] Daniel Liberzon,et al. Calculus of Variations and Optimal Control Theory: A Concise Introduction , 2012 .
[75] Frank L. Lewis,et al. Multi-agent differential graphical games , 2011, Proceedings of the 30th Chinese Control Conference.
[76] Frank L. Lewis,et al. Discrete-Time Nonlinear HJB Solution Using Approximate Dynamic Programming: Convergence Proof , 2008, IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics).
[77] G. Zames. Feedback and optimal sensitivity: Model reference transformations, multiplicative seminorms, and approximate inverses , 1981 .
[78] William M. McEneaney,et al. A Max-Plus-Based Algorithm for a Hamilton-Jacobi-Bellman Equation of Nonlinear Filtering , 2000, SIAM J. Control. Optim..
[79] Haibo He,et al. Adaptive Event-Triggered Control Based on Heuristic Dynamic Programming for Nonlinear Discrete-Time Systems , 2017, IEEE Transactions on Neural Networks and Learning Systems.
[80] W. Dixon. Optimal Adaptive Control and Differential Games by Reinforcement Learning Principles , 2014 .
[81] Frank L. Lewis,et al. H∞ control of linear discrete-time systems: Off-policy reinforcement learning , 2017, Autom..
[82] Derong Liu,et al. Integral Reinforcement Learning for Linear Continuous-Time Zero-Sum Games With Completely Unknown Dynamics , 2014, IEEE Transactions on Automation Science and Engineering.
[83] S. Kahne,et al. Optimal control: An introduction to the theory and its applications , 1967, IEEE Transactions on Automatic Control.
[84] Ali Heydari,et al. Optimal switching between autonomous subsystems , 2014, J. Frankl. Inst..
[85] Frank L. Lewis,et al. Cooperative Control of Multi-Agent Systems: Optimal and Adaptive Design Approaches , 2013 .
[86] Radford M. Neal. Pattern Recognition and Machine Learning , 2007, Technometrics.
[87] P. Khargonekar,et al. State-Space Solutions to Standard H2 and H∞ Control Problems , 1989 .
[88] Zhong-Ping Jiang,et al. Adaptive dynamic programming and optimal control of nonlinear nonaffine systems , 2014, Autom..
[89] Frank L. Lewis,et al. H∞ Control of Nonaffine Aerial Systems Using Off-policy Reinforcement Learning , 2016, Unmanned Syst..
[90] Kenji Doya,et al. Reinforcement Learning in Continuous Time and Space , 2000, Neural Computation.
[91] Tamer Başar,et al. H∞-Optimal Control and Related Minimax Design Problems , 1995 .
[92] Warren B. Powell,et al. Approximate Dynamic Programming I: Modeling , 2011 .
[93] Donald E. Kirk,et al. Optimal control theory : an introduction , 1970 .
[94] Warren E. Dixon,et al. Online Approximate Optimal Station Keeping of an Autonomous Underwater Vehicle , 2013, ArXiv.
[95] Sarangapani Jagannathan,et al. Distributed event-sampled approximate optimal control of interconnected affine nonlinear continuous-time systems , 2016, 2016 American Control Conference (ACC).
[96] Ali Heydari,et al. Optimal Switching of DC–DC Power Converters Using Approximate Dynamic Programming , 2018, IEEE Transactions on Neural Networks and Learning Systems.
[97] Chrystopher L. Nehaniv,et al. Teaching robots by moulding behavior and scaffolding the environment , 2006, HRI '06.
[98] Cheng-Wan An,et al. Mobile robot navigation using neural Q-learning , 2004, Proceedings of 2004 International Conference on Machine Learning and Cybernetics (IEEE Cat. No.04EX826).
[99] Frank L. Lewis,et al. Actor–Critic-Based Optimal Tracking for Partially Unknown Nonlinear Discrete-Time Systems , 2015, IEEE Transactions on Neural Networks and Learning Systems.
[100] Frank L. Lewis,et al. Nearly optimal control laws for nonlinear systems with saturating actuators using a neural network HJB approach , 2005, Autom..
[101] Warren E. Dixon,et al. Model-based reinforcement learning for on-line feedback-Nash equilibrium solution of N-player nonzero-sum differential games , 2014, 2014 American Control Conference.
[102] Thorsten Joachims,et al. Learning Trajectory Preferences for Manipulators via Iterative Improvement , 2013, NIPS.
[103] Kyriakos G. Vamvoudakis,et al. Non-zero sum Nash Q-learning for unknown deterministic continuous-time linear systems , 2015, Autom..
[104] Yu Jiang,et al. Robust Adaptive Dynamic Programming and Feedback Stabilization of Nonlinear Systems , 2014, IEEE Transactions on Neural Networks and Learning Systems.
[105] Frank L. Lewis,et al. Optimal Tracking Control of Unknown Discrete-Time Linear Systems Using Input-Output Measured Data , 2015, IEEE Transactions on Cybernetics.
[106] William M. McEneaney,et al. Max-plus methods for nonlinear control and estimation , 2005 .
[107] Chris Watkins,et al. Learning from delayed rewards , 1989 .
[108] Chrystopher L. Nehaniv,et al. Imitation with ALICE: learning to imitate corresponding actions across dissimilar embodiments , 2002, IEEE Trans. Syst. Man Cybern. Part A.
[109] Robert Babuška,et al. Policy Derivation Methods for Critic-Only Reinforcement Learning in Continuous Action Spaces , 2016 .
[110] A. van der Schaft. L2-gain analysis of nonlinear systems and nonlinear state-feedback H∞ control , 1992 .
[111] Frank L. Lewis,et al. Online actor critic algorithm to solve the continuous-time infinite horizon optimal control problem , 2009, 2009 International Joint Conference on Neural Networks.
[112] Ali Heydari,et al. Optimal switching between controlled subsystems with free mode sequence , 2015, Neurocomputing.
[113] Richard S. Sutton,et al. Learning to predict by the methods of temporal differences , 1988, Machine Learning.
[114] Robert Babuska,et al. Actor-critic reinforcement learning for tracking control in robotics , 2016, 2016 IEEE 55th Conference on Decision and Control (CDC).