Continuous action reinforcement learning for control-affine systems with unknown dynamics
[1] John N. Tsitsiklis, et al. Neuro-Dynamic Programming, 1996, Encyclopedia of Machine Learning.
[2] D. Ernst, et al. Approximate Value Iteration in the Reinforcement Learning Context. Application to Electrical Power System Control, 2005.
[3] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.
[4] Steven M. LaValle, et al. Planning algorithms, 2006.
[5] Frank L. Lewis, et al. A Neural Network Solution for Fixed-Final Time Optimal Control of Nonlinear Systems, 2006, 2006 14th Mediterranean Conference on Control and Automation.
[6] Hajime Kimura. Reinforcement learning in multi-dimensional state-action space using random rectangular coarse coding and Gibbs sampling, 2007, SICE Annual Conference 2007.
[7] Csaba Szepesvári, et al. Fitted Q-iteration in continuous action-space MDPs, 2007, NIPS.
[8] Andrea Bonarini, et al. Reinforcement Learning in Continuous Action Spaces through Sequential Monte Carlo Methods, 2007, NIPS.
[9] Csaba Szepesvári, et al. Finite-Time Bounds for Fitted Value Iteration, 2008, J. Mach. Learn. Res.
[10] Frank L. Lewis, et al. Discrete-Time Nonlinear HJB Solution Using Approximate Dynamic Programming: Convergence Proof, 2008, IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics).
[11] Victor Uc Cetina, et al. Reinforcement learning in continuous state and action spaces, 2009.
[12] J. Lévine. Analysis and Control of Nonlinear Systems: A Flatness-based Approach, 2009.
[13] Sarangapani Jagannathan, et al. Decentralized nearly optimal control of a class of interconnected nonlinear discrete-time systems by using online Hamilton-Bellman-Jacobi formulation, 2010, The 2010 International Joint Conference on Neural Networks (IJCNN).
[14] Thomas J. Walsh, et al. Integrating Sample-Based Planning and Model-Based Reinforcement Learning, 2010, AAAI.
[15] Bart De Schutter, et al. Reinforcement Learning and Dynamic Programming Using Function Approximators, 2010.
[16] Michael L. Littman, et al. Sample-Based Planning for Continuous Action Markov Decision Processes, 2011, ICAPS.
[17] Csaba Szepesvári, et al. X-armed Bandits, 2022.
[18] Sarangapani Jagannathan, et al. Decentralized Optimal Control of a Class of Interconnected Nonlinear Discrete-Time Systems by Using Online Hamilton-Jacobi-Bellman Formulation, 2011, IEEE Transactions on Neural Networks.
[19] Anthony J. Calise, et al. Derivative-free decentralized adaptive control of large-scale interconnected uncertain systems, 2011, IEEE Conference on Decision and Control and European Control Conference.
[20] Warren E. Dixon, et al. Asymptotic tracking by a reinforcement learning-based adaptive critic controller, 2011.
[21] Robert Babuska, et al. A Survey of Actor-Critic Reinforcement Learning: Standard and Natural Policy Gradients, 2012, IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews).
[22] Sarangapani Jagannathan, et al. Online Optimal Control of Affine Nonlinear Discrete-Time Systems With Unknown Internal Dynamics by Using Time-Based Policy Update, 2012, IEEE Transactions on Neural Networks and Learning Systems.
[23] Anthony Cowley, et al. Parsing Indoor Scenes Using RGB-D Imagery, 2012, Robotics: Science and Systems.
[24] Zhong-Ping Jiang, et al. Computational adaptive optimal control for continuous-time linear systems with completely unknown dynamics, 2012, Autom.
[25] Robert Babuska, et al. Optimistic planning for continuous-action deterministic systems, 2013, 2013 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL).
[26] Lydia Tapia, et al. Learning swing-free trajectories for UAVs with a suspended load, 2013, 2013 IEEE International Conference on Robotics and Automation.
[27] Jan Peters, et al. Reinforcement learning in robotics: A survey, 2013, Int. J. Robotics Res.
[28] P. Olver. Nonlinear Systems, 2013.
[29] F. Lewis, et al. A policy iteration approach to online optimal control of continuous-time constrained-input systems, 2013, ISA Transactions.
[30] F. Lewis, et al. Online adaptive algorithm for optimal control with integral reinforcement learning, 2014.
[31] Barry D. Nichols. Reinforcement learning in continuous state- and action-space, 2014.