Martin A. Riedmiller | Nir Levine | Abbas Abdolmaleki | Jost Tobias Springenberg | Todd Hester | Daniel J. Mankowitz | Timothy Mann | Rae Jeong
[1] Laurent El Ghaoui, et al. Robust Control of Markov Decision Processes with Uncertain Transition Matrices, 2005, Oper. Res.
[2] Daniel Kuhn, et al. Robust Markov Decision Processes, 2013, Math. Oper. Res.
[3] Shie Mannor, et al. Policy Gradients with Variance Related Risk Criteria, 2012, ICML.
[4] Shie Mannor, et al. Learning Robust Options, 2018, AAAI.
[5] Jakub W. Pachocki, et al. Learning dexterous in-hand manipulation, 2018, Int. J. Robotics Res.
[6] Jun Morimoto, et al. Robust Reinforcement Learning, 2005, Neural Computation.
[7] Henry Zhu, et al. Soft Actor-Critic Algorithms and Applications, 2018, ArXiv.
[8] Yuval Tassa, et al. Maximum a Posteriori Policy Optimisation, 2018, ICLR.
[9] OpenAI. Learning Dexterous In-Hand Manipulation, 2018.
[10] Divyam Rastogi, et al. Sample-efficient Reinforcement Learning via Difference Models, 2018.
[11] Shie Mannor, et al. Scaling Up Robust MDPs using Function Approximation, 2014, ICML.
[12] Sergey Levine, et al. QT-Opt: Scalable Deep Reinforcement Learning for Vision-Based Robotic Manipulation, 2018, CoRL.
[13] Shie Mannor, et al. A Bayesian Approach to Robust Reinforcement Learning, 2019, UAI.
[14] Max Welling, et al. Auto-Encoding Variational Bayes, 2013, ICLR.
[15] Shie Mannor, et al. A Deep Hierarchical Approach to Lifelong Learning in Minecraft, 2016, AAAI.
[16] Tom Schaul, et al. Unicorn: Continual Learning with a Universal, Off-policy Agent, 2018, ArXiv.
[17] Stefan Schaal, et al. Path integral control and bounded rationality, 2011, 2011 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL).
[18] Doina Precup, et al. Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning, 1999, Artif. Intell.
[19] Pieter Abbeel, et al. Equivalence Between Policy Gradients and Soft Q-Learning, 2017, ArXiv.
[20] Pieter Abbeel, et al. Mutual Alignment Transfer Learning, 2017, CoRL.
[21] Sergey Levine, et al. Learning modular neural network policies for multi-task and multi-robot transfer, 2016, 2017 IEEE International Conference on Robotics and Automation (ICRA).
[22] Gabriel Dulac-Arnold, et al. Challenges of Real-World Reinforcement Learning, 2019, ArXiv.
[23] Yee Whye Teh, et al. Distral: Robust multitask reinforcement learning, 2017, NIPS.
[24] Daan Wierstra, et al. Stochastic Backpropagation and Approximate Inference in Deep Generative Models, 2014, ICML.
[25] Razvan Pascanu, et al. Policy Distillation, 2015, ICLR.
[26] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.
[27] Dale Schuurmans, et al. Bridging the Gap Between Value and Policy Based Reinforcement Learning, 2017, NIPS.
[28] Jordi Grau-Moya, et al. Planning with Information-Processing Constraints and Model Uncertainty in Markov Decision Processes, 2016, ECML/PKDD.
[29] Jim Gao, et al. Machine Learning Applications for Data Center Optimization, 2014.
[30] Marcin Andrychowicz, et al. Sim-to-Real Transfer of Robotic Control with Dynamics Randomization, 2017, 2018 IEEE International Conference on Robotics and Automation (ICRA).
[31] Yuval Tassa, et al. Learning Continuous Control Policies by Stochastic Value Gradients, 2015, NIPS.
[32] Yuval Tassa, et al. DeepMind Control Suite, 2018, ArXiv.
[33] Shie Mannor, et al. Soft-Robust Actor-Critic Policy-Gradient, 2018, UAI.
[34] Wojciech Zaremba, et al. Transfer from Simulation to Real World through Learning Deep Inverse Dynamics Model, 2016, ArXiv.
[35] Pieter Abbeel, et al. Benchmarking Deep Reinforcement Learning for Continuous Control, 2016, ICML.
[36] Martin A. Riedmiller, et al. Learning by Playing - Solving Sparse Reward Tasks from Scratch, 2018, ICML.
[37] Roy Fox, et al. Taming the Noise in Reinforcement Learning via Soft Updates, 2015, UAI.
[38] Shane Legg, et al. Human-level control through deep reinforcement learning, 2015, Nature.
[39] Garud Iyengar, et al. Robust Dynamic Programming, 2005, Math. Oper. Res.
[40] Yuval Tassa, et al. Relative Entropy Regularized Policy Iteration, 2018, ArXiv.
[41] Naftali Tishby, et al. Trading Value and Information in MDPs, 2012.