Anirudh Goyal | Riashat Islam | Daniel Strouse | Zafarali Ahmed | Matthew Botvinick | Hugo Larochelle | Sergey Levine | Yoshua Bengio
[1] Jürgen Schmidhuber, et al. Curious model-building control systems, 1991, Proceedings of the 1991 IEEE International Joint Conference on Neural Networks.
[2] R. J. Williams, et al. Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning, 1992, Machine Learning.
[3] Doina Precup, et al. Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning, 1999, Artif. Intell.
[4] Yishay Mansour, et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation, 1999, NIPS.
[5] Naftali Tishby, et al. The information bottleneck method, 2000, ArXiv.
[6] Andrew G. Barto, et al. Automatic Discovery of Subgoals in Reinforcement Learning using Diverse Density, 2001, ICML.
[7] E. Miller, et al. An integrative theory of prefrontal cortex function, 2001, Annual Review of Neuroscience.
[8] Doina Precup, et al. Learning Options in Reinforcement Learning, 2002, SARA.
[9] Shie Mannor, et al. Q-Cut - Dynamic Discovery of Sub-goals in Reinforcement Learning, 2002, ECML.
[10] Peter Dayan, et al. Structure in the Space of Value Functions, 2002, Machine Learning.
[11] Alicia P. Wolfe, et al. Identifying useful subgoals in reinforcement learning by local graph partitioning, 2005, ICML.
[12] Michael L. Littman, et al. An analysis of model-based Interval Estimation for Markov Decision Processes, 2008, J. Comput. Syst. Sci.
[13] Hamid Beigy, et al. Using Strongly Connected Components as a Basis for Autonomous Skill Acquisition in Reinforcement Learning, 2009, ISNN.
[14] Daniel Polani, et al. Grounding subgoals in information transitions, 2011, IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL).
[15] Renato Renner, et al. An intuitive proof of the data processing inequality, 2011, Quantum Inf. Comput.
[16] Max Welling, et al. Auto-Encoding Variational Bayes, 2013, ICLR.
[17] J. Kinney, et al. Equitability, mutual information, and the maximal information coefficient, 2013, Proceedings of the National Academy of Sciences.
[18] Tom Schaul, et al. Universal Value Function Approximators, 2015, ICML.
[19] Marc G. Bellemare, et al. The Arcade Learning Environment: An Evaluation Platform for General Agents, 2012, J. Artif. Intell. Res.
[20] Filip De Turck, et al. VIME: Variational Information Maximizing Exploration, 2016, NIPS.
[21] Benjamin Van Roy, et al. Deep Exploration via Bootstrapped DQN, 2016, NIPS.
[22] Alex Graves, et al. Asynchronous Methods for Deep Reinforcement Learning, 2016, ICML.
[23] Tom Schaul, et al. Unifying Count-Based Exploration and Intrinsic Motivation, 2016, NIPS.
[24] Olivier Marre, et al. Relevant sparse codes with variational information bottleneck, 2016, NIPS.
[25] Tom Schaul, et al. FeUdal Networks for Hierarchical Reinforcement Learning, 2017, ICML.
[26] Filip De Turck, et al. #Exploration: A Study of Count-Based Exploration for Deep Reinforcement Learning, 2016, NIPS.
[27] Alexei A. Efros, et al. Curiosity-Driven Exploration by Self-Supervised Prediction, 2017, IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).
[28] Razvan Pascanu, et al. Imagination-Augmented Agents for Deep Reinforcement Learning, 2017, NIPS.
[29] Marlos C. Machado, et al. A Laplacian Framework for Option Discovery in Reinforcement Learning, 2017, ICML.
[30] Stefano Soatto, et al. Emergence of invariance and disentangling in deep representations, 2017.
[31] Alexander A. Alemi, et al. Deep Variational Information Bottleneck, 2017, ICLR.
[32] Georg Ostrovski, et al. Count-Based Exploration with Neural Density Models, 2017, ICML.
[33] Yee Whye Teh, et al. Distral: Robust multitask reinforcement learning, 2017, NIPS.
[34] Alec Radford, et al. Proximal Policy Optimization Algorithms, 2017, ArXiv.
[35] Pieter Abbeel, et al. Emergence of Grounded Compositional Language in Multi-Agent Populations, 2017, AAAI.
[36] Sergey Levine, et al. Meta-Reinforcement Learning of Structured Exploration Strategies, 2018, NeurIPS.
[37] Joshua B. Tenenbaum, et al. Learning to Share and Hide Intentions using Information Regularization, 2018, NeurIPS.
[38] Sergey Levine, et al. Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor, 2018, ICML.
[39] Regina Barzilay, et al. Representation Learning for Grounded Spatial Reasoning, 2017, TACL.
[40] M. Botvinick, et al. Mental labour, 2018, Nature Human Behaviour.
[41] Stefano Soatto, et al. Information Dropout: Learning Optimal Representations Through Noisy Computation, 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[42] Marcin Andrychowicz, et al. Multi-Goal Reinforcement Learning: Challenging Robotics Environments and Request for Research, 2018, ArXiv.
[43] Sergey Levine, et al. Recall Traces: Backtracking Models for Efficient Reinforcement Learning, 2018, ICLR.
[44] David H. Wolpert, et al. Nonlinear Information Bottleneck, 2017, Entropy.