Efficient robust policy optimization
[1] R. Stephenson. A and V , 1962, The British journal of ophthalmology.
[2] M. Ciletti,et al. The computation and theory of optimal control , 1972 .
[3] David Q. Mayne,et al. Differential dynamic programming , 1972, The Mathematical Gazette.
[4] Douglas C. Montgomery,et al. Using common random numbers in simulation experiments — an approach to statistical analysis , 1976 .
[5] Ronald J. Williams,et al. Adaptive state representation and estimation using recurrent connectionist networks , 1990 .
[6] Richard S. Sutton,et al. Neural networks for control , 1990 .
[7] Donald A. Sofge,et al. Handbook of Intelligent Control: Neural, Fuzzy, and Adaptive Approaches , 1992 .
[8] Paul J. Werbos,et al. The Roots of Backpropagation: From Ordered Derivatives to Neural Networks and Political Forecasting , 1994 .
[9] A. Varga,et al. Optimal output feedback control: a multi-model approach , 1996, Proceedings of Joint Conference on Control Applications Intelligent Control and Computer Aided Control System Design.
[10] Lee A. Feldkamp,et al. Fixed-weight controller for multiple systems , 1997, Proceedings of International Conference on Neural Networks (ICNN'97).
[11] Tariq Samad,et al. Neuro-Control Design: Optimization Aspects , 1997 .
[12] R. Longchamp,et al. A Minimax Approach for Multi-Objective Controller Design Using Multiple Models , 1999 .
[13] Andreas Griewank,et al. Evaluating derivatives - principles and techniques of algorithmic differentiation, Second Edition , 2000, Frontiers in applied mathematics.
[14] Michael I. Jordan,et al. PEGASUS: A policy search method for large MDPs and POMDPs , 2000, UAI.
[15] Jeff G. Schneider,et al. Autonomous helicopter control using reinforcement learning policy search methods , 2001, Proceedings 2001 ICRA. IEEE International Conference on Robotics and Automation (Cat. No.01CH37164).
[16] Kurt E. Häggblom,et al. Application of robust and multimodel control methods to an ill-conditioned distillation column , 2002 .
[17] Jennie Si,et al. Backpropagation Through Time and Derivative Adaptive Critics: A Common Framework for Comparison , 2004 .
[18] Warren B. Powell,et al. Handbook of Learning and Approximate Dynamic Programming , 2006, IEEE Transactions on Automatic Control.
[19] P. Werbos. Backwards Differentiation in AD and Neural Nets: Past Links and New Opportunities , 2006 .
[20] Danil V. Prokhorov. Training Recurrent Neurocontrollers for Robustness With Derivative-Free Kalman Filter , 2006, IEEE Transactions on Neural Networks.
[21] Matthew McNaughton,et al. CASTRO: robust nonlinear trajectory optimization using multiple models , 2007, 2007 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[22] A. Poznyak,et al. Min-Max Output Integral Sliding Mode Control for Multiplant Linear Uncertain Systems , 2007, 2007 American Control Conference.
[23] J. Shinar,et al. Solution of a Linear Pursuit-Evasion Game with Variable Structure and Uncertain Dynamics , 2007 .
[24] Frank L. Lewis,et al. Guest Editorial: Special Issue on Adaptive Dynamic Programming and Reinforcement Learning in Feedback Control , 2008, IEEE Trans. Syst. Man Cybern. Part B.
[25] Christopher G. Atkeson,et al. Random Sampling of States in Dynamic Programming , 2007, IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics).
[26] Huaguang Zhang,et al. Adaptive Dynamic Programming: An Introduction , 2009, IEEE Computational Intelligence Magazine.
[27] Luigi Fortuna,et al. Reinforcement Learning and Adaptive Dynamic Programming for Feedback Control , 2009 .
[28] A. Poznyak,et al. The dynamic programming approach to multi-model robust optimization , 2010 .
[29] Christopher G. Atkeson,et al. Physical human interaction for an inflatable manipulator , 2011, 2011 Annual International Conference of the IEEE Engineering in Medicine and Biology Society.
[30] Frank L. Lewis,et al. Special issue on approximate dynamic programming and reinforcement learning , 2011 .
[31] C. Atkeson. Efficient Robust Policy Optimization (Long Version) , 2012 .