Chenjia Bai | Zhaoran Wang | Jianye Hao | Peng Liu | Lingxiao Wang | Animesh Garg | Lei Han