Wei Xu | Fuxin Li | Qinxun Bai | Neale Ratzlaff
[1] Marcin Andrychowicz,et al. Parameter Space Noise for Exploration , 2017, ICLR.
[2] Dilin Wang,et al. Stein Variational Gradient Descent: A General Purpose Bayesian Inference Algorithm , 2016, NIPS.
[3] Yuval Tassa,et al. Continuous control with deep reinforcement learning , 2015, ICLR.
[4] W. R. Thompson. On the Likelihood that One Unknown Probability Exceeds Another in View of the Evidence of Two Samples , 1933 .
[5] Luca Ambrogioni,et al. Wasserstein Variational Gradient Descent: From Semi-Discrete Optimal Transport to Ensemble Variational Inference , 2018, ArXiv.
[6] Jürgen Schmidhuber,et al. Curious model-building control systems , 1991, [Proceedings] 1991 IEEE International Joint Conference on Neural Networks.
[7] K. Chaloner,et al. Bayesian Experimental Design: A Review , 1995 .
[8] Fuxin Li,et al. HyperGAN: A Generative Model for Diverse, Performant Neural Networks , 2019, ICML.
[9] Marc G. Bellemare,et al. Count-Based Exploration with Neural Density Models , 2017, ICML.
[10] Sergey Levine,et al. Diversity is All You Need: Learning Skills without a Reward Function , 2018, ICLR.
[11] Benjamin Van Roy,et al. Deep Exploration via Bootstrapped DQN , 2016, NIPS.
[12] Albin Cassirer,et al. Randomized Prior Functions for Deep Reinforcement Learning , 2018, NeurIPS.
[13] Sebastian Nowozin,et al. Can You Trust Your Model's Uncertainty? Evaluating Predictive Uncertainty Under Dataset Shift , 2019, NeurIPS.
[14] Tom Schaul,et al. Unifying Count-Based Exploration and Intrinsic Motivation , 2016, NIPS.
[15] Alexei A. Efros,et al. Large-Scale Study of Curiosity-Driven Learning , 2018, ICLR.
[16] Quoc V. Le,et al. Swish: a Self-Gated Activation Function , 2017, ArXiv.
[17] Wojciech Jaskowski,et al. Model-Based Active Exploration , 2018, ICML.
[18] Demis Hassabis,et al. Mastering the game of Go with deep neural networks and tree search , 2016, Nature.
[19] Alex Graves,et al. Playing Atari with Deep Reinforcement Learning , 2013, ArXiv.
[20] H. Robbins,et al. Asymptotically efficient adaptive allocation rules , 1985 .
[21] Alexei A. Efros,et al. Curiosity-Driven Exploration by Self-Supervised Prediction , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).
[22] Dilin Wang,et al. Learning to Draw Samples with Amortized Stein Variational Gradient Descent , 2017, UAI.
[23] Ruben Villegas,et al. Learning Latent Dynamics for Planning from Pixels , 2018, ICML.
[24] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[25] Le Song,et al. Provable Bayesian Inference via Particle Mirror Descent , 2015, AISTATS.
[26] Quoc V. Le,et al. Searching for Activation Functions , 2018, ArXiv.
[27] Sergey Levine,et al. Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor , 2018, ICML.
[28] Sergey Levine,et al. Deep Reinforcement Learning in a Handful of Trials using Probabilistic Dynamics Models , 2018, NeurIPS.
[29] Shane Legg,et al. Noisy Networks for Exploration , 2017, ICLR.
[30] Jürgen Schmidhuber,et al. Exploring the predictable , 2003 .
[31] Amos J. Storkey,et al. Exploration by Random Network Distillation , 2018, ICLR.
[32] David M. Blei,et al. Variational Inference: A Review for Statisticians , 2016, ArXiv.
[33] Zheng Wen,et al. Deep Exploration via Randomized Value Functions , 2017, J. Mach. Learn. Res..
[34] Christoph Salge,et al. Empowerment - an Introduction , 2013, ArXiv.
[35] Yi Sun,et al. Planning to Be Surprised: Optimal Bayesian Exploration in Dynamic Environments , 2011, AGI.
[36] P. Diaconis,et al. Use of exchangeable pairs in the analysis of simulations , 2004 .
[37] Filip De Turck,et al. VIME: Variational Information Maximizing Exploration , 2016, NIPS.
[38] Daan Wierstra,et al. Variational Intrinsic Control , 2016, ICLR.
[39] Thomas P. Hayes,et al. Stochastic Linear Optimization under Bandit Feedback , 2008, COLT.
[40] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[41] Deepak Pathak,et al. Self-Supervised Exploration via Disagreement , 2019, ICML.
[42] Peter Auer,et al. Using Confidence Bounds for Exploitation-Exploration Trade-offs , 2003, J. Mach. Learn. Res..
[43] Richard E. Turner,et al. Gradient Estimators for Implicit Models , 2017, ICLR.
[44] Shakir Mohamed,et al. Variational Information Maximisation for Intrinsically Motivated Reinforcement Learning , 2015, NIPS.