Combining the benefits of function approximation and trajectory optimization