Policy Search for Motor Primitives in Robotics