PLATO: Policy learning using adaptive trajectory optimization
Tianhao Zhang | Gregory Kahn | Sergey Levine | Pieter Abbeel
[1] E. Todorov, et al. A generalized iterative LQG method for locally-optimal feedback control of constrained nonlinear stochastic systems, 2005, Proceedings of the 2005 American Control Conference.
[2] Martin J. Wainwright, et al. Divergences, surrogate loss functions and experimental design, 2005, NIPS.
[3] David Q. Mayne, et al. Robust model predictive control of constrained linear systems with bounded disturbances, 2005, Automatica.
[4] Ian D. Reid, et al. Real-Time SLAM Relocalisation, 2007, 2007 IEEE 11th International Conference on Computer Vision.
[5] Philippe Martin, et al. The true role of accelerometer feedback in quadrotor control, 2010, 2010 IEEE International Conference on Robotics and Automation.
[6] Yasemin Altun, et al. Relative Entropy Policy Search, 2010.
[7] Geoffrey E. Hinton, et al. Rectified Linear Units Improve Restricted Boltzmann Machines, 2010, ICML.
[8] Geoffrey J. Gordon, et al. A Reduction of Imitation Learning and Structured Prediction to No-Regret Online Learning, 2010, AISTATS.
[9] He He, et al. Imitation Learning by Coaching, 2012, NIPS.
[10] Vicenç Gómez, et al. Optimal control as a graphical model inference problem, 2009, Machine Learning.
[11] Geoffrey E. Hinton, et al. ImageNet classification with deep convolutional neural networks, 2012, Communications of the ACM.
[12] Martial Hebert, et al. Learning monocular reactive UAV control in cluttered natural environments, 2013, IEEE International Conference on Robotics and Automation.
[13] V. Climenhaga. Markov chains and mixing times, 2013.
[14] Alex Graves, et al. Playing Atari with Deep Reinforcement Learning, 2013, arXiv.
[15] Sergey Levine, et al. Guided Policy Search, 2013, ICML.
[16] Jan Peters, et al. A Survey on Policy Search for Robotics, 2013, Foundations and Trends in Robotics.
[17] Sergey Levine, et al. Variational Policy Search via Trajectory Optimization, 2013, NIPS.
[18] Trevor Darrell, et al. Caffe: Convolutional Architecture for Fast Feature Embedding, 2014, ACM Multimedia.
[19] Honglak Lee, et al. Deep Learning for Real-Time Atari Game Play Using Offline Monte-Carlo Tree Search Planning, 2014, NIPS.
[20] Quoc V. Le, et al. Sequence to Sequence Learning with Neural Networks, 2014, NIPS.
[21] Sergey Levine, et al. Learning Neural Network Policies with Guided Policy Search under Unknown Dynamics, 2014, NIPS.
[22] Yuval Tassa, et al. Learning Continuous Control Policies by Stochastic Value Gradients, 2015, NIPS.
[23] Sergey Levine, et al. Trust Region Policy Optimization, 2015, ICML.
[24] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[25] Jianxiong Xiao, et al. DeepDriving: Learning Affordance for Direct Perception in Autonomous Driving, 2015, IEEE International Conference on Computer Vision (ICCV).
[26] Yuval Tassa, et al. Continuous control with deep reinforcement learning, 2015, ICLR.
[27] Sergey Levine, et al. Learning deep control policies for autonomous aerial vehicles with MPC-guided policy search, 2015, 2016 IEEE International Conference on Robotics and Automation (ICRA).
[28] Guigang Zhang, et al. Deep Learning, 2016, International Journal of Semantic Computing.
[29] Jürgen Schmidhuber, et al. A Machine Learning Approach to Visual Perception of Forest Trails for Mobile Robots, 2016, IEEE Robotics and Automation Letters.
[30] Sergey Levine, et al. End-to-End Training of Deep Visuomotor Policies, 2015, Journal of Machine Learning Research.