Kuang-Huei Lee | Ian Fischer | Anthony Z. Liu | Yijie Guo | Honglak Lee | John Canny | Sergio Guadarrama
[1] E. Jaynes. Information Theory and Statistical Mechanics, 1957.
[2] Jimmy Ba,et al. Dream to Control: Learning Behaviors by Latent Imagination , 2019, ICLR.
[3] Jürgen Schmidhuber,et al. Reinforcement Learning in Markovian and Non-Markovian Environments , 1990, NIPS.
[4] Anind K. Dey,et al. Maximum Entropy Inverse Reinforcement Learning , 2008, AAAI.
[5] Henry Zhu,et al. Soft Actor-Critic Algorithms and Applications , 2018, ArXiv.
[6] Marc G. Bellemare,et al. DeepMDP: Learning Continuous Latent Space Models for Representation Learning , 2019, ICML.
[7] Ohad Shamir,et al. Learning and generalization with the information bottleneck , 2008, Theor. Comput. Sci.
[8] Sergey Levine,et al. Model-Based Reinforcement Learning for Atari , 2019, ICLR.
[9] Alexei A. Efros,et al. Curiosity-Driven Exploration by Self-Supervised Prediction , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).
[10] Alexander A. Alemi,et al. CEB Improves Model Robustness , 2020, Entropy.
[11] J. Andrew Bagnell,et al. Modeling Purposeful Adaptive Behavior with the Principle of Maximum Causal Entropy, 2010.
[12] Sam Devlin,et al. Generalization in Reinforcement Learning with Selective Noise Injection and Information Bottleneck , 2019, NeurIPS.
[13] Yoshua Bengio,et al. Unsupervised State Representation Learning in Atari , 2019, NeurIPS.
[14] Naftali Tishby,et al. Predictive Information , 1999, cond-mat/9902341.
[15] Saurabh Singh,et al. Filter Response Normalization Layer: Eliminating Batch Dependence in the Training of Deep Neural Networks , 2019, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[16] Alexander A. Alemi,et al. On Variational Bounds of Mutual Information , 2019, ICML.
[17] Sergey Ioffe,et al. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift , 2015, ICML.
[18] Herke van Hoof,et al. Addressing Function Approximation Error in Actor-Critic Methods , 2018, ICML.
[19] Sergey Levine,et al. Deep Reinforcement Learning in a Handful of Trials using Probabilistic Dynamics Models , 2018, NeurIPS.
[20] R. Fergus,et al. Image Augmentation Is All You Need: Regularizing Deep Reinforcement Learning from Pixels , 2020, ICLR.
[21] Doina Precup,et al. An information-theoretic approach to curiosity-driven reinforcement learning , 2012, Theory in Biosciences.
[22] Sergey Levine,et al. Stochastic Latent Actor-Critic: Deep Reinforcement Learning with a Latent Variable Model , 2019, NeurIPS.
[23] P. Abbeel,et al. Reinforcement Learning with Augmented Data , 2020, NeurIPS.
[24] Pieter Abbeel,et al. CURL: Contrastive Unsupervised Representations for Reinforcement Learning , 2020, ICML.
[25] Richard S. Sutton,et al. Dyna, an integrated architecture for learning, planning, and reacting , 1990, SGART Bull.
[26] Joelle Pineau,et al. Improving Sample Efficiency in Model-Free Reinforcement Learning from Images , 2019, AAAI.
[27] Sergey Levine,et al. When to Trust Your Model: Model-Based Policy Optimization , 2019, NeurIPS.
[28] Sergey Levine,et al. Reinforcement Learning with Deep Energy-Based Policies , 2017, ICML.
[29] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[30] Yuval Tassa,et al. DeepMind Control Suite , 2018, ArXiv.
[31] Oriol Vinyals,et al. Representation Learning with Contrastive Predictive Coding , 2018, ArXiv.
[32] Yoshua Bengio,et al. The Variational Bandwidth Bottleneck: Stochastic Evaluation on an Information Budget , 2020, ICLR.
[33] Sergey Levine,et al. InfoBot: Transfer and Exploration via the Information Bottleneck , 2019, ICLR.
[34] Alexei A. Efros,et al. Large-Scale Study of Curiosity-Driven Learning , 2018, ICLR.
[35] Trevor Darrell,et al. Loss is its own Reward: Self-Supervision for Reinforcement Learning , 2016, ICLR.
[36] Tom Schaul,et al. Reinforcement Learning with Unsupervised Auxiliary Tasks , 2016, ICLR.
[37] Ian S. Fischer,et al. The Conditional Entropy Bottleneck , 2020, Entropy.
[38] Yoshua Bengio,et al. Learning deep representations by mutual information estimation and maximization , 2018, ICLR.
[39] Jürgen Schmidhuber,et al. On Learning to Think: Algorithmic Information Theory for Novel Combinations of Reinforcement Learning Controllers and Recurrent Neural World Models , 2015, ArXiv.
[40] J. Schmidhuber. Making the World Differentiable: On Using Self-Supervised Fully Recurrent Neural Networks for Dynamic Reinforcement Learning and Planning in Non-Stationary Environments, 1990.
[41] Raef Bassily,et al. Learners that Use Little Information , 2017, ALT.
[42] Ruben Villegas,et al. Learning Latent Dynamics for Planning from Pixels , 2018, ICML.