Behavior-Guided Reinforcement Learning
Michael I. Jordan | Krzysztof Choromanski | Anna Choromanska | Yunhao Tang | Aldo Pacchiano | Jack Parker-Holder
[1] Max Welling, et al. Auto-Encoding Variational Bayes, 2013, ICLR.
[2] Yee Whye Teh, et al. An Analysis of Categorical Distributional Reinforcement Learning, 2018, AISTATS.
[3] Yuesheng Xu, et al. Universal Kernels, 2006, J. Mach. Learn. Res.
[4] Wojciech Zaremba, et al. OpenAI Gym, 2016, ArXiv.
[5] Marc G. Bellemare, et al. A Distributional Perspective on Reinforcement Learning, 2017, ICML.
[6] Brendan Maginnis, et al. On Wasserstein Reinforcement Learning and the Fokker-Planck equation, 2017, ArXiv.
[7] Marco Cuturi, et al. Sinkhorn Distances: Lightspeed Computation of Optimal Transport, 2013, NIPS.
[8] C. Villani. Optimal Transport: Old and New, 2008.
[9] Léon Bottou, et al. Wasserstein Generative Adversarial Networks, 2017, ICML.
[10] Xi Chen, et al. Evolution Strategies as a Scalable Alternative to Reinforcement Learning, 2017, ArXiv.
[11] Bernhard Schölkopf, et al. Wasserstein Auto-Encoders, 2017, ICLR.
[12] Marc G. Bellemare, et al. Statistics and Samples in Distributional Reinforcement Learning, 2019, ICML.
[13] Shie Mannor, et al. Nonlinear Distributional Gradient Temporal-Difference Learning, 2018, ICML.
[14] Andrew Y. Ng, et al. Algorithms for Inverse Reinforcement Learning, 2000, ICML.
[15] Richard E. Turner, et al. Geometrically Coupled Monte Carlo Sampling, 2018, NeurIPS.
[16] Yuval Tassa, et al. DeepMind Control Suite, 2018, ArXiv.
[17] Kenneth O. Stanley, et al. Improving Exploration in Evolution Strategies for Deep Reinforcement Learning via a Population of Novelty-Seeking Agents, 2017, NeurIPS.
[18] Atil Iscen, et al. Provably Robust Blackbox Optimization for Reinforcement Learning, 2019, CoRL.
[19] Rémi Munos, et al. Implicit Quantile Networks for Distributional Reinforcement Learning, 2018, ICML.
[20] Michalis Vazirgiannis, et al. Matching Node Embeddings for Graph Similarity, 2017, AAAI.
[21] Lawrence Carin, et al. Policy Optimization as Wasserstein Gradient Flows, 2018, ICML.
[22] R. Miikkulainen, et al. Learning Behavior Characterizations for Novelty Search, 2016, GECCO.
[23] John Langford, et al. Approximately Optimal Approximate Reinforcement Learning, 2002, ICML.
[24] Richard E. Turner, et al. Structured Evolution with Compact Architectures for Scalable Policy Optimization, 2018, ICML.
[25] Jürgen Schmidhuber, et al. Recurrent World Models Facilitate Policy Evolution, 2018, NeurIPS.
[26] Yishay Mansour, et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation, 1999, NIPS.
[27] Benjamin Recht, et al. Random Features for Large-Scale Kernel Machines, 2007, NIPS.
[28] Matt J. Kusner, et al. From Word Embeddings To Document Distances, 2015, ICML.
[29] Kenneth O. Stanley, et al. Evolvability ES: scalable and direct optimization of evolvability, 2019, GECCO.
[30] Benjamin Recht, et al. Simple random search of static linear policies is competitive for reinforcement learning, 2018, NeurIPS.
[31] Alex Graves, et al. Playing Atari with Deep Reinforcement Learning, 2013, ArXiv.
[32] Nicolas Le Roux, et al. Distributional reinforcement learning with linear function approximation, 2019, AISTATS.
[33] Gabriel Peyré, et al. Stochastic Optimization for Large-scale Optimal Transport, 2016, NIPS.
[34] Yurii Nesterov, et al. Introductory Lectures on Convex Optimization - A Basic Course, 2014, Applied Optimization.
[35] Ronald J. Williams. Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning, 1992, Machine Learning.
[36] Alec Radford, et al. Proximal Policy Optimization Algorithms, 2017, ArXiv.
[37] Kenneth O. Stanley, et al. Quality Diversity: A New Frontier for Evolutionary Computation, 2016, Front. Robot. AI.
[38] Richard Sinkhorn. A Relationship Between Arbitrary Positive Matrices and Doubly Stochastic Matrices, 1964.
[39] Zoran Popović, et al. Learning behavior styles with inverse reinforcement learning, 2010, SIGGRAPH 2010.
[40] Silvia Chiappa, et al. Wasserstein Fair Classification, 2019, UAI.
[41] Martin J. Wainwright, et al. Statistical guarantees for the EM algorithm: From population to sample-based analysis, 2014, ArXiv.
[42] Peter Stone, et al. Behavioral Cloning from Observation, 2018, IJCAI.
[43] J. Lehman. Evolution Through the Search for Novelty, 2012.