Learning to Score Behaviors for Guided Policy Optimization
Michael I. Jordan, Krzysztof Choromanski, Anna Choromanska, Yunhao Tang, Aldo Pacchiano, Jack Parker-Holder