Sham M. Kakade | Emanuel Todorov | Aravind Rajeswaran | Igor Mordatch | Kendall Lowrey
[1] Marc G. Bellemare, et al. Safe and Efficient Off-Policy Reinforcement Learning, 2016, NIPS.
[2] Emanuel Todorov, et al. Universal Convexification via Risk-Aversion, 2014, UAI.
[3] Allan Jabri, et al. Universal Planning Networks, 2018, ICML.
[4] Alex Graves, et al. Asynchronous Methods for Deep Reinforcement Learning, 2016, ICML.
[5] Geoffrey J. Gordon, et al. A Reduction of Imitation Learning and Structured Prediction to No-Regret Online Learning, 2010, AISTATS.
[6] J. Andrew Bagnell, et al. Agnostic System Identification for Model-Based Reinforcement Learning, 2012, ICML.
[7] Ramkrishna Pasumarthy, et al. Identifying Topology of Power Distribution Networks Based on Smart Meter Data, 2016, ArXiv.
[8] S. Joe Qin, et al. A survey of industrial model predictive control technology, 2003.
[9] Sergey Levine, et al. Model-Based Value Estimation for Efficient Model-Free Reinforcement Learning, 2018, ArXiv.
[10] Demis Hassabis, et al. Mastering the game of Go with deep neural networks and tree search, 2016, Nature.
[11] Peter Auer, et al. Finite-time Analysis of the Multiarmed Bandit Problem, 2002, Machine Learning.
[12] Shane Legg, et al. Human-level control through deep reinforcement learning, 2015, Nature.
[13] Yuval Tassa, et al. Synthesis and stabilization of complex behaviors through online trajectory optimization, 2012, 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[14] Stefan Schaal, et al. A Generalized Path Integral Control Approach to Reinforcement Learning, 2010, J. Mach. Learn. Res.
[15] David Barber, et al. Thinking Fast and Slow with Deep Learning and Tree Search, 2017, NIPS.
[16] Emanuel Todorov, et al. Reinforcement learning for non-prehensile manipulation: Transfer from simulation to physical system, 2018, 2018 IEEE International Conference on Simulation, Modeling, and Programming for Autonomous Robots (SIMPAR).
[17] Tom Schaul, et al. Unifying Count-Based Exploration and Intrinsic Motivation, 2016, NIPS.
[18] Marcin Andrychowicz, et al. Sim-to-Real Transfer of Robotic Control with Dynamics Randomization, 2017, 2018 IEEE International Conference on Robotics and Automation (ICRA).
[19] Sergey Levine, et al. Guided Policy Search, 2013, ICML.
[20] Vikash Kumar, et al. Manipulators and Manipulation in high dimensional spaces, 2016.
[21] E. Todorov, et al. A generalized iterative LQG method for locally-optimal feedback control of constrained nonlinear stochastic systems, 2005, Proceedings of the 2005 American Control Conference.
[22] Andrew Y. Ng, et al. Policy Invariance Under Reward Transformations: Theory and Application to Reward Shaping, 1999, ICML.
[23] Sham M. Kakade, et al. Towards Generalization and Simplicity in Continuous Control, 2017, NIPS.
[24] David H. Jacobson, et al. Differential Dynamic Programming, 1970, Elsevier.
[25] Lennart Ljung. System Identification: Theory for the User, 1987.
[26] Emanuel Todorov, et al. Compositionality of optimal control laws, 2009, NIPS.
[27] Ramkrishna Pasumarthy, et al. A novel approach for phase identification in smart grids using Graph Theory and Principal Component Analysis, 2015, 2016 American Control Conference (ACC).
[28] Alexandre M. Bayen, et al. Variance Reduction for Policy Gradient with Action-Dependent Factorized Baselines, 2018, ICLR.
[29] Yuval Tassa, et al. Value function approximation and model predictive control, 2013, 2013 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL).
[30] Sergey Levine, et al. High-Dimensional Continuous Control Using Generalized Advantage Estimation, 2015, ICLR.
[31] Perttu Hämäläinen, et al. Augmenting sampling based controllers with machine learning, 2017, Symposium on Computer Animation.
[32] Sergey Levine, et al. Learning Complex Dexterous Manipulation with Deep Reinforcement Learning and Demonstrations, 2017, Robotics: Science and Systems.
[33] Sergey Levine, et al. Divide-and-Conquer Reinforcement Learning, 2017, ICLR.
[34] Honglak Lee, et al. Sample-Efficient Reinforcement Learning with Stochastic Ensemble Value Expansion, 2018, NeurIPS.
[35] Nuttapong Chentanez, et al. Intrinsically Motivated Reinforcement Learning, 2004, NIPS.
[36] James M. Rehg, et al. Aggressive driving with model predictive path integral control, 2016, 2016 IEEE International Conference on Robotics and Automation (ICRA).
[37] Filip De Turck, et al. VIME: Variational Information Maximizing Exploration, 2016, NIPS.
[38] Kamyar Azizzadenesheli, et al. Sample-Efficient Deep RL with Generative Adversarial Tree Search, 2018, ArXiv.
[39] Christopher G. Atkeson, et al. Using Local Trajectory Optimizers to Speed Up Global Optimization in Dynamic Programming, 1993, NIPS.
[40] Sergey Levine, et al. Incentivizing Exploration In Reinforcement Learning With Deep Predictive Models, 2015, ArXiv.
[41] Csaba Szepesvári, et al. Finite-Time Bounds for Fitted Value Iteration, 2008, J. Mach. Learn. Res.
[42] Byron Boots, et al. Dual Policy Iteration, 2018, NeurIPS.
[43] Benjamin Van Roy, et al. (More) Efficient Reinforcement Learning via Posterior Sampling, 2013, NIPS.
[44] Sergey Levine, et al. Neural Network Dynamics for Model-Based Deep Reinforcement Learning with Model-Free Fine-Tuning, 2017, 2018 IEEE International Conference on Robotics and Automation (ICRA).
[45] Shane Legg, et al. Noisy Networks for Exploration, 2017, ICLR.
[46] Emanuel Todorov, et al. Combining the benefits of function approximation and trajectory optimization, 2014, Robotics: Science and Systems.
[47] Manfred Morari, et al. Model predictive control: Theory and practice - A survey, 1989, Autom.
[48] Jakub W. Pachocki, et al. Learning dexterous in-hand manipulation, 2018, Int. J. Robotics Res.
[49] Marcin Andrychowicz, et al. Parameter Space Noise for Exploration, 2017, ICLR.
[50] Andrew G. Barto, et al. Learning to Act Using Real-Time Dynamic Programming, 1995, Artif. Intell.
[51] Jeff G. Schneider, et al. Policy Search by Dynamic Programming, 2003, NIPS.
[52] Balaraman Ravindran, et al. EPOpt: Learning Robust Neural Network Policies Using Model Ensembles, 2016, ICLR.
[53] Albin Cassirer, et al. Randomized Prior Functions for Deep Reinforcement Learning, 2018, NeurIPS.
[54] Demis Hassabis, et al. Mastering the game of Go without human knowledge, 2017, Nature.
[55] Michael Kearns, et al. Near-Optimal Reinforcement Learning in Polynomial Time, 2002, Machine Learning.
[56] Wei Chu, et al. A contextual-bandit approach to personalized news article recommendation, 2010, WWW '10.
[57] Karl Johan Åström, et al. Feedback Systems: An Introduction for Scientists and Engineers, Second Edition, 2015.
[58] Yuval Tassa, et al. Control-limited differential dynamic programming, 2014, 2014 IEEE International Conference on Robotics and Automation (ICRA).
[59] Byron Boots, et al. Truncated Horizon Policy Search: Combining Reinforcement Learning & Imitation Learning, 2018, ICLR.
[60] Yuval Tassa, et al. MuJoCo: A physics engine for model-based control, 2012, 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[61] Alexei A. Efros, et al. Curiosity-Driven Exploration by Self-Supervised Prediction, 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).