Online learning control based on projected gradient temporal difference and advanced heuristic dynamic programming
暂无分享,去创建一个
[1] Frank L. Lewis,et al. Online actor critic algorithm to solve the continuous-time infinite horizon optimal control problem , 2009, 2009 International Joint Conference on Neural Networks.
[2] F.L. Lewis,et al. Reinforcement learning and adaptive dynamic programming for feedback control , 2009, IEEE Circuits and Systems Magazine.
[3] Derong Liu,et al. Adaptive Dynamic Programming for Optimal Tracking Control of Unknown Nonlinear Systems With Application to Coal Gasification , 2014, IEEE Transactions on Automation Science and Engineering.
[4] Xin Xu,et al. Reinforcement learning algorithms with function approximation: Recent advances and applications , 2014, Inf. Sci..
[5] Haibo He,et al. A three-network architecture for on-line learning and optimization based on adaptive dynamic programming , 2012, Neurocomputing.
[6] Haibo He,et al. Adaptive Learning and Control for MIMO System Based on Adaptive Dynamic Programming , 2011, IEEE Transactions on Neural Networks.
[7] Shalabh Bhatnagar,et al. Convergent Temporal-Difference Learning with Arbitrary Smooth Function Approximation , 2009, NIPS.
[8] Sanqing Hu,et al. MAC protocol identification approach for implement smart cognitive radio , 2012, 2012 IEEE International Conference on Communications (ICC).
[9] Frank L. Lewis,et al. Reinforcement learning and optimal adaptive control: An overview and implementation examples , 2012, Annu. Rev. Control..
[10] Jinyu Wen,et al. Adaptive Learning in Tracking Control Based on the Dual Critic Network Design , 2013, IEEE Transactions on Neural Networks and Learning Systems.
[11] Jennie Si,et al. Online learning control by association and reinforcement , 2000, Proceedings of the IEEE-INNS-ENNS International Joint Conference on Neural Networks. IJCNN 2000. Neural Computing: New Challenges and Perspectives for the New Millennium.
[12] Haibo He. Self-Adaptive Systems for Machine Intelligence: He/Machine Intelligence , 2011 .
[13] Leemon C. Baird,et al. Residual Algorithms: Reinforcement Learning with Function Approximation , 1995, ICML.
[14] Frank L. Lewis,et al. Adaptive Dynamic Programming for feedback control , 2009, 2009 7th Asian Control Conference.
[15] Frank L. Lewis,et al. Nearly optimal control laws for nonlinear systems with saturating actuators using a neural network HJB approach , 2005, Autom..
[16] Haibo He. Self-Adaptive Systems for Machine Intelligence , 2011 .
[17] John N. Tsitsiklis,et al. Analysis of temporal-difference learning with function approximation , 1996, NIPS 1996.
[18] Warrren B Powell,et al. A review of stochastic algorithms with continuous value function approximation and some new approximate policy iteration algorithms for multidimensional continuous applications , 2011 .
[19] Xin Xu,et al. Kernel-Based Representation Policy Iteration with Applications to Optimal Path Tracking of Wheeled Mobile Robots , 2013, IScIDE.
[20] Huaguang Zhang,et al. Adaptive Dynamic Programming: An Introduction , 2009, IEEE Computational Intelligence Magazine.