Trajectory-Based Optimal Control Techniques: The Basics of Learning Control
[1] Geoffrey J. Gordon, et al. Finding Approximate POMDP Solutions Through Belief Compression, 2011, J. Artif. Intell. Res.
[2] Christopher G. Atkeson, et al. Control of a walking biped using a combination of simple policies, 2009, 9th IEEE-RAS International Conference on Humanoid Robots.
[3] Sanjiv Singh, et al. The DARPA Urban Challenge: Autonomous Vehicles in City Traffic, George Air Force Base, Victorville, California, USA, 2009, The DARPA Urban Challenge.
[4] Christopher G. Atkeson, et al. Standing balance control using a trajectory library, 2009, IEEE/RSJ International Conference on Intelligent Robots and Systems.
[5] Emanuel Todorov, et al. Efficient computation of optimal actions, 2009, Proceedings of the National Academy of Sciences.
[6] David Silver, et al. Learning to search: Functional gradient techniques for imitation learning, 2009, Auton. Robots.
[7] Jan Peters, et al. Learning motor primitives for robotics, 2009, IEEE International Conference on Robotics and Automation.
[8] Carl E. Rasmussen, et al. Gaussian process dynamic programming, 2009, Neurocomputing.
[9] Duy Nguyen-Tuong, et al. Local Gaussian Process Regression for Real Time Online Model Learning, 2008, NIPS.
[10] Jan Peters, et al. Fitted Q-iteration by Advantage Weighted Regression, 2008, NIPS.
[11] Jürgen Schmidhuber, et al. State-Dependent Exploration for Policy Gradient Methods, 2008, ECML/PKDD.
[12] Stefan Schaal, et al. A Bayesian approach to empirical local linearization for robotics, 2008, IEEE International Conference on Robotics and Automation.
[13] Stefan Schaal, et al. Reinforcement learning of motor skills with policy gradients, 2008, Neural Networks (Special Issue).
[14] Stefan Schaal, et al. Learning to Control in Operational Space, 2008, Int. J. Robotics Res.
[15] Christopher G. Atkeson, et al. Random Sampling of States in Dynamic Programming, 2007, IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics).
[16] Sanjiv Singh, et al. The 2005 DARPA Grand Challenge: The Great Robot Race, 2007.
[17] Christopher M. Bishop. Pattern Recognition and Machine Learning, 2006, Springer.
[18] Stefan Schaal, et al. The New Robotics—towards Human-centered Machines, 2007.
[19] C. Atkeson. Randomly Sampling Actions in Dynamic Programming, 2007, IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning.
[20] H. Kappen. An introduction to stochastic control theory, path integrals and reinforcement learning, 2007.
[21] Stefan Schaal, et al. Policy Gradient Methods for Robotics, 2006, IEEE/RSJ International Conference on Intelligent Robots and Systems.
[22] Marc Toussaint, et al. Probabilistic inference for solving discrete and continuous state Markov Decision Processes, 2006, ICML.
[23] Jun Morimoto, et al. Learning CPG-based Biped Locomotion with a Policy Gradient Method: Application to a Humanoid Robot, 2005, 5th IEEE-RAS International Conference on Humanoid Robots.
[24] Stefan Schaal, et al. Incremental Online Learning in High Dimensions, 2005, Neural Computation.
[25] Pierre Geurts, et al. Tree-Based Batch Mode Reinforcement Learning, 2005, J. Mach. Learn. Res.
[26] H. Kappen. Linear theory for control of nonlinear stochastic systems, 2004, Physical Review Letters.
[27] H. Sebastian Seung, et al. Stochastic policy gradient reinforcement learning on a simple 3D biped, 2004, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).
[28] Jessica K. Hodgins, et al. Synthesizing physically realistic human motion in low-dimensional, behavior-specific spaces, 2004, ACM Trans. Graph.
[29] Pieter Abbeel, et al. Apprenticeship learning via inverse reinforcement learning, 2004, ICML.
[30] Yoshihiko Nakamura, et al. Embodied Symbol Emergence Based on Mimesis Theory, 2004, Int. J. Robotics Res.
[31] Stefan Schaal, et al. Natural Actor-Critic, 2003, Neurocomputing.
[32] Stefan Schaal, et al. Computational approaches to motor learning by imitation, 2003, Philosophical Transactions of the Royal Society of London, Series B.
[33] Carl E. Rasmussen, et al. Gaussian Processes for Machine Learning, 2005, Adaptive Computation and Machine Learning.
[34] Andrew W. Moore, et al. Variable Resolution Discretization in Optimal Control, 2002, Machine Learning.
[35] Stefan Schaal, et al. Learning inverse kinematics, 2001, IEEE/RSJ International Conference on Intelligent Robots and Systems.
[36] Sham M. Kakade, et al. A Natural Policy Gradient, 2001, NIPS.
[37] Lorenzo Sciavicco, Bruno Siciliano. Modelling and Control of Robot Manipulators, 2000, Springer.
[38] Michael I. Jordan, et al. PEGASUS: A policy search method for large MDPs and POMDPs, 2000, UAI.
[39] Jun Morimoto, et al. Acquisition of stand-up behavior by a real robot using hierarchical reinforcement learning, 2000, Robotics Auton. Syst.
[40] Yishay Mansour, et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation, 1999, NIPS.
[41] Shun-ichi Amari, et al. Natural Gradient Learning for Over- and Under-Complete Bases in ICA, 1999, Neural Computation.
[43] Stefan Schaal, et al. Is imitation learning the route to humanoid robots?, 1999, Trends in Cognitive Sciences.
[44] Christopher G. Atkeson, et al. Constructive Incremental Learning from Only Local Information, 1998, Neural Computation.
[45] D. M. Wolpert, et al. Multiple paired forward and inverse models for motor control, 1998, Neural Networks.
[46] Stefan Schaal, et al. Robot Learning From Demonstration, 1997, ICML.
[47] J. Spall, et al. Optimal random perturbations for stochastic approximation using a simultaneous perturbation gradient approximation, 1997, American Control Conference.
[48] Stefan Schaal, et al. Learning tasks from a single demonstration, 1997, IEEE International Conference on Robotics and Automation.
[49] Andrew W. Moore, et al. Locally Weighted Learning for Control, 1997, Artificial Intelligence Review.
[50] Andrew W. Moore, et al. Locally Weighted Learning, 1997, Artificial Intelligence Review.
[51] Stefan Schaal, et al. Learning from Demonstration, 1996, NIPS.
[52] C. J. C. H. Watkins. Learning from Delayed Rewards, 1989, PhD thesis, King's College, Cambridge.
[53] Michael I. Jordan, et al. Supervised learning from incomplete data via an EM approach, 1993, NIPS.
[54] S. Grossberg, et al. A Self-Organizing Neural Model of Motor Equivalent Reaching and Tool Use by a Multijoint Arm, 1993, Journal of Cognitive Neuroscience.
[55] Michael I. Jordan, et al. Forward Models: Supervised Learning with a Distal Teacher, 1992, Cogn. Sci.
[56] Ronald J. Williams. Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning, 1992, Machine Learning.
[57] Richard S. Sutton. Learning to predict by the methods of temporal differences, 1988, Machine Learning.
[58] W. Cleveland. Robust Locally Weighted Regression and Smoothing Scatterplots, 1979, Journal of the American Statistical Association.
[59] A. P. Dempster, N. M. Laird, D. B. Rubin. Maximum likelihood from incomplete data via the EM algorithm (with discussion), 1977, Journal of the Royal Statistical Society, Series B.
[60] Yaakov Bar-Shalom, et al. Caution, Probing, and the Value of Information in the Control of Uncertain Systems, 1976.
[61] Y. Bar-Shalom, et al. Wide-sense adaptive dual control for nonlinear stochastic systems, 1973.
[62] M. Ciletti, et al. The computation and theory of optimal control, 1972.
[63] David H. Jacobson, David Q. Mayne. Differential Dynamic Programming, 1970, American Elsevier.
[64] Henk Nijmeijer, et al. Robot Programming by Demonstration, 2010, SIMPAR.
[65] Warren B. Powell, et al. Handbook of Learning and Approximate Dynamic Programming, 2006, IEEE Transactions on Automatic Control.
[66] Richard S. Sutton, Andrew G. Barto. Reinforcement Learning: An Introduction, 1998, MIT Press.
[67] Jun Nakanishi, et al. Learning Attractor Landscapes for Learning Motor Primitives, 2002, NIPS.
[68] Jonathan Baxter, et al. Scaling Internal-State Policy-Gradient Methods for POMDPs, 2002.
[69] Manfred Opper, et al. Sparse Representation for Gaussian Process Models, 2000, NIPS.
[70] Andrew Y. Ng, et al. Algorithms for Inverse Reinforcement Learning, 2000, ICML.
[71] Kenji Doya. Reinforcement Learning in Continuous Time and Space, 2000, Neural Computation.
[72] Geoffrey E. Hinton, et al. Using EM for Reinforcement Learning, 2000.
[73] IEEE Robotics & Automation Magazine, 1994.
[74] M. Kawato, et al. Trajectory formation of arm movement by a neural network with forward and inverse dynamics models, 1993.
[75] Vijaykumar Gullapalli. A stochastic reinforcement learning algorithm for learning real-valued functions, 1990, Neural Networks.
[76] Christopher G. Atkeson, et al. Using Local Models to Control Movement, 1989, NIPS.