Relative Variational Intrinsic Control
David Warde-Farley | Volodymyr Mnih | Kate Baumli | Steven Hansen
[1] David Warde-Farley,et al. Fast Task Inference with Variational Intrinsic Successor Features , 2019, ICLR.
[2] Demis Hassabis,et al. Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm , 2017, ArXiv.
[3] Yuval Tassa,et al. DeepMind Control Suite , 2018, ArXiv.
[4] Chrystopher L. Nehaniv,et al. Empowerment: a universal agent-centric measure of control , 2005, 2005 IEEE Congress on Evolutionary Computation.
[5] David Barber,et al. Information Maximization in Noisy Channels: A Variational Approach , 2003, NIPS.
[6] Doina Precup,et al. Temporal abstraction in reinforcement learning , 2000, ICML 2000.
[7] Sergey Levine,et al. Diversity is All You Need: Learning Skills without a Reward Function , 2018, ICLR.
[8] Doina Precup,et al. What can I do here? A Theory of Affordances in Reinforcement Learning , 2020, ICML.
[9] Shane Legg,et al. Human-level control through deep reinforcement learning , 2015, Nature.
[10] Shimon Whiteson,et al. Protecting against evaluation overfitting in empirical reinforcement learning , 2011, 2011 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL).
[11] Marc G. Bellemare,et al. The Arcade Learning Environment: An Evaluation Platform for General Agents , 2012, J. Artif. Intell. Res.
[12] Eduardo F. Morales,et al. An Introduction to Reinforcement Learning , 2011 .
[13] Taehoon Kim,et al. Quantifying Generalization in Reinforcement Learning , 2018, ICML.
[14] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[15] 三嶋 博之. The theory of affordances , 2008 .
[16] Rémi Munos,et al. Recurrent Experience Replay in Distributed Reinforcement Learning , 2018, ICLR.
[17] Shakir Mohamed,et al. Variational Information Maximisation for Intrinsically Motivated Reinforcement Learning , 2015, NIPS.
[18] Stefan Wermter,et al. Improving reinforcement learning with interactive feedback and affordances , 2014, 4th International Conference on Development and Learning and on Epigenetic Robotics.
[19] David Barber,et al. The IM algorithm: a variational approach to Information Maximization , 2003, NIPS 2003.
[20] David Warde-Farley,et al. Unsupervised Control Through Non-Parametric Discriminative Rewards , 2018, ICLR.
[21] Sergey Levine,et al. Dynamics-Aware Unsupervised Discovery of Skills , 2019, ICLR.
[22] Pieter Abbeel,et al. Variational Option Discovery Algorithms , 2018, ArXiv.
[23] Demis Hassabis,et al. Mastering the game of Go without human knowledge , 2017, Nature.
[24] Daan Wierstra,et al. Variational Intrinsic Control , 2016, ICLR.
[25] Doina Precup,et al. When Waiting is not an Option: Learning Options with a Deliberation Cost , 2017, AAAI.
[26] Doina Precup,et al. Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning , 1999, Artif. Intell.