A Survey on Policy Search for Robotics
[1] A. A. Feldbaum,et al. DUAL CONTROL THEORY, IV , 1961 .
[2] John A. Nelder,et al. A Simplex Method for Function Minimization , 1965, Comput. J..
[3] B. Anderson,et al. Optimal Filtering , 1979, IEEE Transactions on Systems, Man, and Cybernetics.
[4] E. B. Andersen,et al. Information Science and Statistics , 1986 .
[5] Keith Glover,et al. Robust control design using normalized coprime factor plant descriptions , 1989 .
[6] W. Cleveland,et al. Locally Weighted Regression: An Approach to Regression Analysis by Local Fitting , 1988 .
[7] Christopher G. Atkeson,et al. Task-level robot learning: juggling a tennis ball more accurately , 1989, Proceedings, 1989 International Conference on Robotics and Automation.
[8] Karl Johan Åström,et al. Adaptive Control , 1989, Embedded Digital Control with Microcontrollers.
[9] J. Aplevich,et al. Lecture Notes in Control and Information Sciences , 1979 .
[10] R. J. Williams,et al. Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning , 2004, Machine Learning.
[11] Peter Norvig,et al. Artificial Intelligence: A Modern Approach , 1995 .
[12] Björn Wittenmark,et al. Adaptive Dual Control Methods: An Overview , 1995 .
[13] Dimitri P. Bertsekas,et al. Dynamic Programming and Optimal Control, Two Volume Set , 1995 .
[14] Jeff G. Schneider,et al. Exploiting Model Uncertainty Estimates for Safe Dynamic Control Learning , 1996, NIPS.
[15] David K. Smith,et al. Dynamic Programming and Optimal Control. Volume 1 , 1996 .
[16] Geoffrey E. Hinton,et al. Using Expectation-Maximization for Reinforcement Learning , 1997, Neural Computation.
[17] Christopher G. Atkeson,et al. A comparison of direct and model-based reinforcement learning , 1997, Proceedings of International Conference on Robotics and Automation.
[18] Andrew G. Barto,et al. Reinforcement learning , 1998 .
[19] Visakan Kadirkamanathan,et al. Dual adaptive control of nonlinear stochastic systems using neural networks , 1998, Autom..
[20] Shun-ichi Amari,et al. Natural Gradient Works Efficiently in Learning , 1998, Neural Computation.
[21] Christopher G. Atkeson,et al. Constructive Incremental Learning from Only Local Information , 1998, Neural Computation.
[22] Geoffrey E. Hinton,et al. A View of the Em Algorithm that Justifies Incremental, Sparse, and other Variants , 1998, Learning in Graphical Models.
[23] Thomas G. Dietterich. Adaptive computation and machine learning , 1998 .
[24] Shigenobu Kobayashi,et al. Efficient Non-Linear Control by Combining Q-learning with Local Linear Controllers , 1999, ICML.
[25] Justin A. Boyan,et al. Least-Squares Temporal Difference Learning , 1999, ICML.
[26] Yishay Mansour,et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation , 1999, NIPS.
[27] P. Bartlett,et al. Direct Gradient-Based Reinforcement Learning: I. Gradient Estimation Algorithms , 1999 .
[28] Kenji Doya,et al. Reinforcement Learning in Continuous Time and Space , 2000, Neural Computation.
[29] Michael I. Jordan,et al. PEGASUS: A policy search method for large MDPs and POMDPs , 2000, UAI.
[30] J. Baxter,et al. Direct gradient-based reinforcement learning , 2000, 2000 IEEE International Symposium on Circuits and Systems. Emerging Technologies for the 21st Century. Proceedings (IEEE Cat No.00CH36353).
[31] Peter L. Bartlett,et al. Infinite-Horizon Policy-Gradient Estimation , 2001, J. Artif. Intell. Res..
[32] Jeff G. Schneider,et al. Autonomous helicopter control using reinforcement learning policy search methods , 2001, Proceedings 2001 ICRA. IEEE International Conference on Robotics and Automation (Cat. No.01CH37164).
[33] Jun Nakanishi,et al. Learning Attractor Landscapes for Learning Motor Primitives , 2002, NIPS.
[34] Sebastian Thrun,et al. Probabilistic robotics , 2002, CACM.
[35] Thomas G. Dietterich,et al. Editors. Advances in Neural Information Processing Systems , 2002 .
[36] Rémi Coulom,et al. Reinforcement Learning Using Neural Networks, with Applications to Motor Control. (Apprentissage par renforcement utilisant des réseaux de neurones, avec des applications au contrôle moteur) , 2002 .
[37] Jun Morimoto,et al. Minimax Differential Dynamic Programming: An Application to Robust Biped Walking , 2002, NIPS.
[38] S. Shankar Sastry,et al. Autonomous Helicopter Flight via Reinforcement Learning , 2003, NIPS.
[39] Petros Koumoutsakos,et al. Reducing the Time Complexity of the Derandomized Evolution Strategy with Covariance Matrix Adaptation (CMA-ES) , 2003, Evolutionary Computation.
[40] Stefan Schaal,et al. Reinforcement Learning for Humanoid Robotics , 2003 .
[41] Jeff G. Schneider,et al. Covariant policy search , 2003, IJCAI 2003.
[42] Jun Nakanishi,et al. Learning Movement Primitives , 2005, ISRR.
[43] Michail G. Lagoudakis,et al. Least-Squares Policy Iteration , 2003, J. Mach. Learn. Res..
[44] A. Pacut,et al. Model-free off-policy reinforcement learning in continuous environment , 2004, 2004 IEEE International Joint Conference on Neural Networks (IEEE Cat. No.04CH37541).
[45] Jeffrey K. Uhlmann,et al. Unscented filtering and nonlinear estimation , 2004, Proceedings of the IEEE.
[46] Peter Stone,et al. Policy gradient reinforcement learning for fast quadrupedal locomotion , 2004, IEEE International Conference on Robotics and Automation, 2004. Proceedings. ICRA '04. 2004.
[47] Ben Tse,et al. Autonomous Inverted Helicopter Flight via Reinforcement Learning , 2004, ISER.
[48] David J. C. MacKay,et al. Information Theory, Inference, and Learning Algorithms , 2004, IEEE Transactions on Information Theory.
[49] Carl E. Rasmussen,et al. A Unifying View of Sparse Approximate Gaussian Process Regression , 2005, J. Mach. Learn. Res..
[50] Zoubin Ghahramani,et al. Sparse Gaussian Processes using Pseudo-inputs , 2005, NIPS.
[51] Stefan Schaal,et al. Natural Actor-Critic , 2003, Neurocomputing.
[52] Pieter Abbeel,et al. Using inaccurate models in reinforcement learning , 2006, ICML.
[53] Stephen P. Boyd,et al. Convex Optimization , 2004, Algorithms and Theory of Computation Handbook.
[54] Emanuel Todorov,et al. Optimal Control Theory , 2006 .
[55] Christopher M. Bishop,et al. Pattern Recognition and Machine Learning (Information Science and Statistics) , 2006 .
[56] C. W. Chan,et al. Performance evaluation of UKF-based nonlinear filtering , 2006, Autom..
[57] Stefan Schaal,et al. Policy Gradient Methods for Robotics , 2006, 2006 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[58] Chang Shu,et al. Numerical comparison of least square-based finite-difference (LSFD) and radial basis function-based finite-difference (RBFFD) methods , 2006, Comput. Math. Appl..
[59] Radford M. Neal. Pattern Recognition and Machine Learning , 2007, Technometrics.
[60] Stefan Schaal,et al. Applying the Episodic Natural Actor-Critic Architecture to Motor Primitive Learning , 2007, ESANN.
[61] Dieter Fox,et al. Gaussian Processes and Reinforcement Learning for Identification and Control of an Autonomous Blimp , 2007, Proceedings 2007 IEEE International Conference on Robotics and Automation.
[62] Jürgen Schmidhuber,et al. State-Dependent Exploration for Policy Gradient Methods , 2008, ECML/PKDD.
[63] Jan Peters,et al. Fitted Q-iteration by Advantage Weighted Regression , 2008, NIPS.
[64] Frank Sehnke,et al. Policy Gradients with Parameter-Based Exploration for Control , 2008, ICANN.
[65] Jun Morimoto,et al. Learning CPG-based Biped Locomotion with a Policy Gradient Method: Application to a Humanoid Robot , 2005, 5th IEEE-RAS International Conference on Humanoid Robots, 2005..
[66] Betty J. Mohler,et al. Learning perceptual coupling for motor primitives , 2008, 2008 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[67] Jun Nakanishi,et al. A Unifying Methodology for Robot Control with Redundant DOFs , 2008 .
[68] Tom Schaul,et al. Natural Evolution Strategies , 2008, 2008 IEEE Congress on Evolutionary Computation (IEEE World Congress on Computational Intelligence).
[69] Jan Peters,et al. Policy Search for Motor Primitives in Robotics , 2008, NIPS 2008.
[70] Jun Nakanishi,et al. A unifying framework for robot control with redundant DOFs , 2007, Auton. Robots.
[71] Stefan Schaal,et al. 2008 Special Issue: Reinforcement learning of motor skills with policy gradients , 2008 .
[72] Tom Schaul,et al. Efficient natural evolution strategies , 2009, GECCO.
[73] Marc Toussaint,et al. Model-free reinforcement learning as mixture learning , 2009, ICML '09.
[74] Marc Toussaint,et al. Robot trajectory optimization using approximate inference , 2009, ICML '09.
[75] Carl E. Rasmussen,et al. Gaussian processes for machine learning , 2005, Adaptive computation and machine learning.
[76] Tapani Raiko,et al. Variational Bayesian learning of nonlinear hidden state-space models for model predictive control , 2009, Neurocomputing.
[77] Marc Toussaint,et al. Learning model-free robot control by a Monte Carlo EM algorithm , 2009, Auton. Robots.
[78] Carl E. Rasmussen,et al. Gaussian process dynamic programming , 2009, Neurocomputing.
[79] Jan Peters,et al. Model Learning with Local Gaussian Process Regression , 2009, Adv. Robotics.
[80] Christian Igel,et al. Hoeffding and Bernstein races for selecting policies in evolutionary direct policy search , 2009, ICML '09.
[81] Verena Heidrich-Meisner,et al. Neuroevolution strategies for episodic reinforcement learning , 2009, J. Algorithms.
[82] Christoph H. Lampert,et al. Movement templates for learning of hitting and batting , 2010, 2010 IEEE International Conference on Robotics and Automation.
[83] Darwin G. Caldwell,et al. Robot motor skill coordination with EM-based Reinforcement Learning , 2010, 2010 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[84] Yasemin Altun,et al. Relative Entropy Policy Search , 2010 .
[85] Tom Schaul,et al. Exploring parameter space in reinforcement learning , 2010, Paladyn J. Behav. Robotics.
[86] Marc Peter Deisenroth,et al. Efficient reinforcement learning using Gaussian processes , 2010 .
[87] Frank Sehnke,et al. Parameter-exploring policy gradients , 2010, Neural Networks.
[88] Stefan Schaal,et al. A Generalized Path Integral Control Approach to Reinforcement Learning , 2010, J. Mach. Learn. Res..
[89] Carl E. Rasmussen,et al. Learning to Control a Low-Cost Manipulator using Data-Efficient Reinforcement Learning , 2011, Robotics: Science and Systems.
[90] Carl E. Rasmussen,et al. PILCO: A Model-Based and Data-Efficient Approach to Policy Search , 2011, ICML.
[91] Gerhard Neumann,et al. Variational Inference for Policy Search in changing situations , 2011, ICML.
[92] Jan Peters,et al. Reinforcement Learning to Adjust Robot Movements to New Situations , 2010, IJCAI.
[93] Jan Peters,et al. Learning elementary movements jointly with a higher level task , 2011, 2011 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[94] Jan Peters,et al. Learning concurrent motor skills in versatile solution spaces , 2012, 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[95] Jan Peters,et al. Hierarchical Relative Entropy Policy Search , 2014, AISTATS.
[96] Olivier Sigaud,et al. Path Integral Policy Improvement with Covariance Matrix Adaptation , 2012, ICML.
[97] David B. Dunson,et al. Multiresolution Gaussian Processes , 2012, NIPS.
[98] Carme Torras,et al. Learning Collaborative Impedance-Based Robot Behaviors , 2013, AAAI.
[99] Jan Peters,et al. Data-Efficient Generalization of Robot Skills with Contextual Policy Search , 2013, AAAI.
[100] Shinichi Hirai,et al. Robust real time material classification algorithm using soft three axis tactile sensor: Evaluation of the algorithm , 2015, 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).
[101] Richard S. Sutton,et al. Reinforcement Learning , 1992, Handbook of Machine Learning.