Planning Approximate Exploration Trajectories for Model-Free Reinforcement Learning in Contact-Rich Manipulation
Marc Toussaint | Daniel Hennes | Zhongyu Lou | Sabrina Hoppe