Reinforcement Learning for Partially Observable Dynamic Processes: Adaptive Dynamic Programming Using Measured Output Data
[1] Leiba Rodman,et al. Algebraic Riccati equations , 1995 .
[2] Miroslav Krstic,et al. Stabilization of Nonlinear Uncertain Systems , 1998 .
[3] Joe Brewer,et al. Kronecker products and matrix calculus in system theory , 1978 .
[4] Michael G. Safonov,et al. The unfalsified control concept and learning , 1994, Proceedings of 1994 33rd IEEE Conference on Decision and Control.
[5] Richard W. Longman,et al. State-Space System Identification with Identified Hankel Matrix , 1998 .
[6] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[7] Sergio M. Savaresi,et al. Virtual reference feedback tuning for two degree of freedom controllers , 2002 .
[8] R. Skelton,et al. Markov Data-Based LQG Control , 2000 .
[9] Frank L. Lewis,et al. Discrete-Time Nonlinear HJB Solution Using Approximate Dynamic Programming: Convergence Proof , 2008, IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics).
[10] Richard W. Longman,et al. Unifying Input-Output and State-Space Perspectives of Predictive Control , 1998 .
[11] Huaguang Zhang,et al. Adaptive Dynamic Programming: An Introduction , 2009, IEEE Computational Intelligence Magazine.
[12] P. Lancaster,et al. The Algebraic Riccati Equation , 1995 .
[13] Ben J. A. Kröse,et al. Learning from delayed rewards , 1995, Robotics Auton. Syst.
[14] Peter Dayan,et al. A Neural Substrate of Prediction and Reward , 1997, Science.
[15] Svante Gunnarsson,et al. Iterative feedback tuning: theory and applications , 1998 .
[16] John N. Tsitsiklis,et al. Neuro-Dynamic Programming , 1996, Encyclopedia of Machine Learning.
[17] Paul J. Werbos,et al. Approximate dynamic programming for real-time control and neural modeling , 1992 .
[18] Victor M. Becerra,et al. Optimal control , 2008, Scholarpedia.
[19] S. M. Savaresi,et al. Virtual reference feedback tuning for two degree of freedom controllers , 2001, 2001 European Control Conference (ECC).
[20] M. Krstić,et al. Optimal design of adaptive tracking controllers for nonlinear systems , 1997, Proceedings of the 1997 American Control Conference (Cat. No.97CH36041).
[21] Frank L. Lewis,et al. Online actor-critic algorithm to solve the continuous-time infinite horizon optimal control problem , 2010, Autom..
[22] Andrew G. Barto,et al. Adaptive linear quadratic control using policy iteration , 1994, Proceedings of 1994 American Control Conference - ACC '94.
[23] Kenji Doya,et al. Neural mechanisms of learning and control , 2001 .
[24] B. Widrow,et al. Adaptive inverse control , 1987, Proceedings of 8th IEEE International Symposium on Intelligent Control.
[25] Jeffrey Bennighof,et al. Minimum time Pulse Response Based Control of flexible structures , 1991 .
[26] Sean P. Meyn,et al. Q-learning and Pontryagin's Minimum Principle , 2009, Proceedings of the 48th IEEE Conference on Decision and Control (CDC) held jointly with 2009 28th Chinese Control Conference.
[27] Warren B. Powell,et al. Approximate Dynamic Programming - Solving the Curses of Dimensionality , 2007 .
[28] F.L. Lewis,et al. Reinforcement learning and adaptive dynamic programming for feedback control , 2009, IEEE Circuits and Systems Magazine.
[29] Warren B. Powell,et al. Handbook of Learning and Approximate Dynamic Programming , 2006, IEEE Transactions on Automatic Control.
[30] Frank L. Lewis,et al. Online policy iteration based algorithms to solve the continuous-time infinite horizon optimal control problem , 2009, 2009 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning.
[31] Minh Q. Phan,et al. Identification of a Multistep-Ahead Observer and Its Application to Predictive Control , 1997 .
[32] O.H. Bosgra,et al. Suppressing non-periodically repeating disturbances in mechanical servo systems , 1998, Proceedings of the 37th IEEE Conference on Decision and Control (Cat. No.98CH36171).
[33] J. Spall,et al. Model-free control of nonlinear stochastic systems with discrete-time measurements , 1998, IEEE Trans. Autom. Control.
[34] Panos M. Pardalos,et al. Approximate dynamic programming: solving the curses of dimensionality , 2009, Optim. Methods Softw..
[35] P. Werbos,et al. Beyond Regression: New Tools for Prediction and Analysis in the Behavioral Sciences , 1974 .
[36] Paul J. Werbos,et al. Neural networks for control and system identification , 1989, Proceedings of the 28th IEEE Conference on Decision and Control.
[37] Xi-Ren Cao. Stochastic Learning and Optimization , 2007 .
[38] W. Schultz. Neural coding of basic reward terms of animal learning theory, game theory, microeconomics and behavioural ecology , 2004, Current Opinion in Neurobiology.
[39] Frank L. Lewis,et al. Guest Editorial: Special Issue on Adaptive Dynamic Programming and Reinforcement Learning in Feedback Control , 2008, IEEE Trans. Syst. Man Cybern. Part B.
[40] M. Steinbuch,et al. Data-based optimal control , 2005, Proceedings of the 2005, American Control Conference, 2005.
[41] Draguna Vrabie,et al. Adaptive optimal controllers based on Generalized Policy Iteration in a continuous-time framework , 2009, 2009 17th Mediterranean Conference on Control and Automation.
[42] Kenji Doya,et al. Reinforcement Learning in Continuous Time and Space , 2000, Neural Computation.
[43] Frank L. Lewis,et al. Special issue on approximate dynamic programming and reinforcement learning , 2011 .
[44] Jeffrey K. Bennighof,et al. Minimum time Pulse Response Based Control of flexible structures , 1991 .
[45] G. Hewer. An iterative technique for the computation of the steady state gains for the discrete optimal regulator , 1971 .