Sergey Levine | Aurick Zhou | Ashwin Reddy | Vitchyr H. Pong | Abhishek Gupta | Kevin Li | Justin Yu
[1] Sergey Levine, et al. Amortized Conditional Normalized Maximum Likelihood, 2020, ArXiv.
[2] Peter Auer, et al. Finite-time Analysis of the Multiarmed Bandit Problem, 2002, Machine Learning.
[3] Jorma Rissanen, et al. Fisher information and stochastic complexity, 1996, IEEE Trans. Inf. Theory.
[4] Sergey Levine, et al. Meta-World: A Benchmark and Evaluation for Multi-Task and Meta Reinforcement Learning, 2019, CoRL.
[5] Satinder Singh, et al. Many-Goals Reinforcement Learning, 2018, ArXiv.
[6] Lorenzo Natale, et al. Learning latent state representation for speeding up exploration, 2019, ArXiv.
[7] Sergey Levine, et al. Dynamical Distance Learning for Unsupervised and Semi-Supervised Skill Discovery, 2019, ArXiv.
[8] Stefano Ermon, et al. Generative Adversarial Imitation Learning, 2016, NIPS.
[9] Sergey Levine, et al. Learning Complex Dexterous Manipulation with Deep Reinforcement Learning and Demonstrations, 2017, Robotics: Science and Systems.
[10] Sergey Levine, et al. Learning Robust Rewards with Adversarial Inverse Reinforcement Learning, 2017, ICLR.
[11] Benjamin Van Roy, et al. (More) Efficient Reinforcement Learning via Posterior Sampling, 2013, NIPS.
[12] S. Levine, et al. Learning To Reach Goals Without Reinforcement Learning, 2019, ArXiv.
[13] Pierre-Yves Oudeyer, et al. CURIOUS: Intrinsically Motivated Modular Multi-Goal Reinforcement Learning, 2018, ICML.
[14] Jürgen Schmidhuber, et al. Efficient model-based exploration, 1998.
[15] Sergey Levine, et al. Variational Inverse Control with Events: A General Framework for Data-Driven Reward Definition, 2018, NeurIPS.
[16] Anind K. Dey, et al. Maximum Entropy Inverse Reinforcement Learning, 2008, AAAI.
[17] Jun Zhang. Model Selection with Informative Normalized Maximum Likelihood: Data Prior and Model Prior, 2011.
[18] Filip De Turck, et al. VIME: Variational Information Maximizing Exploration, 2016, NIPS.
[19] Sergey Levine, et al. Incentivizing Exploration In Reinforcement Learning With Deep Predictive Models, 2015, ArXiv.
[20] Andrew Y. Ng, et al. Policy Invariance Under Reward Transformations: Theory and Application to Reward Shaping, 1999, ICML.
[21] Chong Wang, et al. Stochastic variational inference, 2012, J. Mach. Learn. Res.
[22] Sergey Levine, et al. Skew-Fit: State-Covering Self-Supervised Reinforcement Learning, 2019, ICML.
[23] Jorma Rissanen, et al. Minimum Description Length Principle, 2010, Encyclopedia of Machine Learning.
[24] Amos J. Storkey, et al. Exploration by Random Network Distillation, 2018, ICLR.
[25] Tom Schaul, et al. Universal Value Function Approximators, 2015, ICML.
[26] Léon Bottou, et al. Towards Principled Methods for Training Generative Adversarial Networks, 2017, ICLR.
[27] Benjamin Van Roy, et al. Deep Exploration via Bootstrapped DQN, 2016, NIPS.
[28] Sergey Levine, et al. The Ingredients of Real-World Robotic Reinforcement Learning, 2020, ICLR.
[29] David Warde-Farley, et al. Unsupervised Control Through Non-Parametric Discriminative Rewards, 2018, ICLR.
[30] Akshay Krishnamurthy, et al. Kinematic State Abstraction and Provably Efficient Rich-Observation Reinforcement Learning, 2019, ICML.
[31] Meir Feder, et al. Deep pNML: Predictive Normalized Maximum Likelihood for Deep Neural Networks, 2019, ArXiv.
[32] Malcolm J. A. Strens, et al. A Bayesian Framework for Reinforcement Learning, 2000, ICML.
[33] Filip De Turck, et al. #Exploration: A Study of Count-Based Exploration for Deep Reinforcement Learning, 2016, NIPS.
[34] Sergey Levine, et al. End-to-End Training of Deep Visuomotor Policies, 2015, J. Mach. Learn. Res.
[35] Filipe Wall Mutz, et al. Hindsight policy gradients, 2017, ICLR.
[36] Alexei A. Efros, et al. Curiosity-Driven Exploration by Self-Supervised Prediction, 2017, IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).
[37] Ali Farhadi, et al. Target-driven visual navigation in indoor scenes using deep reinforcement learning, 2016, IEEE International Conference on Robotics and Automation (ICRA).
[38] Leslie Pack Kaelbling, et al. Learning to Achieve Goals, 1993, IJCAI.
[39] Brendan O'Donoghue, et al. Variational Bayesian Reinforcement Learning with Regret Bounds, 2018, NeurIPS.
[40] Sergey Levine, et al. Visual Reinforcement Learning with Imagined Goals, 2018, NeurIPS.
[41] Peter Stone, et al. Interactively shaping agents via human reinforcement: the TAMER framework, 2009, K-CAP '09.
[42] Justin Fu, et al. EX2: Exploration with Exemplar Models for Deep Reinforcement Learning, 2017, NIPS.
[43] Tom Schaul, et al. Curiosity-driven optimization, 2011, IEEE Congress of Evolutionary Computation (CEC).
[44] Sergey Levine, et al. End-to-End Robotic Reinforcement Learning without Reward Engineering, 2019, Robotics: Science and Systems.
[45] Sergey Levine, et al. Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks, 2017, ICML.
[46] J. Rissanen, et al. Conditional NML Universal Models, 2007, Information Theory and Applications Workshop.
[47] Pieter Abbeel, et al. Apprenticeship learning via inverse reinforcement learning, 2004, ICML.
[48] Andrew Gordon Wilson, et al. A Simple Baseline for Bayesian Uncertainty in Deep Learning, 2019, NeurIPS.
[49] Tom Schaul, et al. Unifying Count-Based Exploration and Intrinsic Motivation, 2016, NIPS.
[50] Sergey Levine, et al. Reinforcement Learning and Control as Probabilistic Inference: Tutorial and Review, 2018, ArXiv.
[51] Jakub W. Pachocki, et al. Learning dexterous in-hand manipulation, 2018, Int. J. Robotics Res.
[52] Marcin Andrychowicz, et al. Hindsight Experience Replay, 2017, NIPS.
[53] Meir Feder, et al. Universal Batch Learning with Log-Loss, 2018, IEEE International Symposium on Information Theory (ISIT).
[54] Julien Cornebise, et al. Weight Uncertainty in Neural Network, 2015, ICML.