Jimmy Ba | Bradly C. Stadie | Silviu Pitis | Harris Chan | Stephen Zhao
[1] Chrystopher L. Nehaniv, et al. Empowerment: a universal agent-centric measure of control, 2005, IEEE Congress on Evolutionary Computation.
[2] Pierre-Yves Oudeyer, et al. Active learning of inverse models with intrinsically motivated goal exploration in robots, 2013, Robotics and Autonomous Systems.
[3] Gaël Varoquaux, et al. Scikit-learn: Machine Learning in Python, 2011, J. Mach. Learn. Res.
[4] Christopher M. Bishop. Pattern Recognition and Machine Learning, 2006, Springer.
[5] Daan Wierstra, et al. Variational Intrinsic Control, 2016, ICLR.
[6] Thomas M. Cover, et al. Elements of Information Theory, 2005.
[7] M. Rosenblatt. Remarks on Some Nonparametric Estimates of a Density Function, 1956.
[8] Pierre-Yves Oudeyer, et al. CURIOUS: Intrinsically Motivated Modular Multi-Goal Reinforcement Learning, 2019, ICML.
[9] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, MIT Press.
[10] Patrick M. Pilarski, et al. Horde: a scalable real-time architecture for learning knowledge from unsupervised sensorimotor interaction, 2011, AAMAS.
[11] Sergey Levine, et al. Trust Region Policy Optimization, 2015, ICML.
[12] Sergey Levine, et al. Skew-Fit: State-Covering Self-Supervised Reinforcement Learning, 2019, ICML.
[13] Animesh Garg, et al. LEAF: Latent Exploration Along the Frontier, 2020.
[14] Javier García, et al. A comprehensive survey on safe reinforcement learning, 2015, J. Mach. Learn. Res.
[15] Benjamin Van Roy, et al. Generalization and Exploration via Randomized Value Functions, 2014, ICML.
[16] David Warde-Farley, et al. Unsupervised Control Through Non-Parametric Discriminative Rewards, 2018, ICLR.
[17] Harm van Seijen, et al. Using a Logarithmic Mapping to Enable Lower Discount Factors in Reinforcement Learning, 2019, NeurIPS.
[18] Tom Schaul, et al. Unifying Count-Based Exploration and Intrinsic Motivation, 2016, NIPS.
[19] Samy Bengio, et al. Density estimation using Real NVP, 2016, ICLR.
[20] Yuval Tassa, et al. Continuous control with deep reinforcement learning, 2015, ICLR.
[21] Tom Schaul, et al. Prioritized Experience Replay, 2015, ICLR.
[22] Sham M. Kakade, et al. Provably Efficient Maximum Entropy Exploration, 2018, ICML.
[23] Shakir Mohamed, et al. Variational Information Maximisation for Intrinsically Motivated Reinforcement Learning, 2015, NIPS.
[24] Kevin Gimpel, et al. Gaussian Error Linear Units (GELUs), 2016.
[25] Eric Nalisnick, et al. Normalizing Flows for Probabilistic Modeling and Inference, 2019, J. Mach. Learn. Res.
[26] Wojciech Jaskowski, et al. Model-Based Active Exploration, 2018, ICML.
[27] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[28] Kenneth O. Stanley, et al. Go-Explore: a New Approach for Hard-Exploration Problems, 2019, ArXiv.
[29] Christoph Salge, et al. Empowerment - an Introduction, 2013, ArXiv.
[30] Philip S. Thomas, et al. High-Confidence Off-Policy Evaluation, 2015, AAAI.
[31] Tom Schaul, et al. Universal Value Function Approximators, 2015, ICML.
[32] Andrew Y. Ng, et al. Near-Bayesian exploration in polynomial time, 2009, ICML.
[33] Filip De Turck, et al. #Exploration: A Study of Count-Based Exploration for Deep Reinforcement Learning, 2016, NIPS.
[34] Max Welling, et al. Auto-Encoding Variational Bayes, 2013, ICLR.
[35] Alexei A. Efros, et al. Curiosity-Driven Exploration by Self-Supervised Prediction, 2017, IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).
[36] Pieter Abbeel, et al. Automatic Goal Generation for Reinforcement Learning Agents, 2017, ICML.
[37] Ruslan Salakhutdinov, et al. Weakly-Supervised Reinforcement Learning for Controllable Behavior, 2020, NeurIPS.
[38] Daniel Guo, et al. Agent57: Outperforming the Atari Human Benchmark, 2020, ICML.
[39] Pieter Abbeel, et al. Automatic Curriculum Learning through Value Disagreement, 2020, NeurIPS.
[40] Alec Radford, et al. Proximal Policy Optimization Algorithms, 2017, ArXiv.
[41] Amos J. Storkey, et al. Exploration by Random Network Distillation, 2018, ICLR.
[42] Richard Socher, et al. Keeping Your Distance: Solving Sparse Reward Tasks Using Self-Balancing Shaped Rewards, 2019, NeurIPS.
[43] Doina Precup, et al. Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning, 1999, Artificial Intelligence.
[44] Brian Yamauchi. A frontier-based approach for autonomous exploration, 1997, IEEE International Symposium on Computational Intelligence in Robotics and Automation (CIRA'97).
[45] Marc G. Bellemare, et al. Count-Based Exploration with Neural Density Models, 2017, ICML.
[46] Guy Lever, et al. Deterministic Policy Gradient Algorithms, 2014, ICML.
[47] Joshua B. Tenenbaum, et al. Meta-Learning for Semi-Supervised Few-Shot Classification, 2018, ICLR.
[48] Sven Behnke, et al. Evaluating the Efficiency of Frontier-based Exploration Strategies, 2010, ISR/ROBOTIK.
[49] Zheng Wen, et al. Deep Exploration via Randomized Value Functions, 2017, J. Mach. Learn. Res.
[50] Herke van Hoof, et al. Addressing Function Approximation Error in Actor-Critic Methods, 2018, ICML.
[51] Marcin Andrychowicz, et al. Parameter Space Noise for Exploration, 2017, ICLR.
[52] Rui Zhao, et al. Maximum Entropy-Regularized Multi-Goal Reinforcement Learning, 2019, ICML.
[53] Philip Wolfe, et al. An algorithm for quadratic programming, 1956.
[54] Sergey Levine, et al. Diversity is All You Need: Learning Skills without a Reward Function, 2018, ICLR.
[55] Marcin Andrychowicz, et al. Multi-Goal Reinforcement Learning: Challenging Robotics Environments and Request for Research, 2018, ArXiv.
[56] Sergey Levine, et al. Visual Reinforcement Learning with Imagined Goals, 2018, NeurIPS.
[57] Alex Graves, et al. Playing Atari with Deep Reinforcement Learning, 2013, ArXiv.
[58] Pierre-Yves Oudeyer, et al. Exploration in Model-based Reinforcement Learning by Empirically Estimating Learning Progress, 2012, NIPS.
[59] Marcin Andrychowicz, et al. Hindsight Experience Replay, 2017, NIPS.
[60] Sergey Levine, et al. Dynamical Distance Learning for Semi-Supervised and Unsupervised Skill Discovery, 2020, ICLR.
[61] Jürgen Schmidhuber. Formal Theory of Creativity, Fun, and Intrinsic Motivation (1990–2010), 2010, IEEE Transactions on Autonomous Mental Development.
[62] Leslie Pack Kaelbling. Learning to Achieve Goals, 1993, IJCAI.
[63] Lei Han, et al. Curriculum-guided Hindsight Experience Replay, 2019, NeurIPS.
[64] Sergey Levine, et al. Efficient Exploration via State Marginal Matching, 2019, ArXiv.
[65] Pierre-Yves Oudeyer, et al. Intrinsically motivated goal exploration for active motor learning in robots: A case study, 2010, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).
[66] Tom Lenaerts, et al. Dynamic Weights in Multi-Objective Deep Reinforcement Learning, 2018, ICML.
[67] David Warde-Farley, et al. Fast Task Inference with Variational Intrinsic Successor Features, 2019, ICLR.
[68] Marcin Andrychowicz, et al. Overcoming Exploration in Reinforcement Learning with Demonstrations, 2018, IEEE International Conference on Robotics and Automation (ICRA).