Honglak Lee | John Canny | Sergio Guadarrama | Kuang-Huei Lee | Anthony Liu | Yijie Guo | Ian Fischer