Hyperparameter Selection for Imitation Learning
Marcin Andrychowicz | Matthieu Geist | Olivier Pietquin | Olivier Bachem | Sertan Girgin | Nikola Momchev | Lukasz Stafiniak | Robert Dadashi | Anton Raichuk | Manu Orsini | Damien Vincent | Léonard Hussenot | Raphaël Marinier | Sabela Ramos
[1] Marcin Andrychowicz, et al. What Matters In On-Policy Reinforcement Learning? A Large-Scale Empirical Study, 2020, ArXiv.
[2] Philip Bachman, et al. Deep Reinforcement Learning that Matters, 2017, AAAI.
[3] Max Jaderberg, et al. Population Based Training of Neural Networks, 2017, ArXiv.
[4] Matthieu Geist, et al. Primal Wasserstein Imitation Learning, 2020, ICLR.
[5] Sergey Levine, et al. Guided Cost Learning: Deep Inverse Optimal Control via Policy Optimization, 2016, ICML.
[6] Sergio Gomez Colmenarejo, et al. RL Unplugged: Benchmarks for Offline Reinforcement Learning, 2020, ArXiv.
[7] Martin A. Riedmiller, et al. Batch Reinforcement Learning, 2012, Reinforcement Learning.
[8] Robert Feldt, et al. Generating diverse software versions with genetic programming: an experimental study, 1998, IEE Proc. Softw.
[9] Andrew Y. Ng, et al. Algorithms for Inverse Reinforcement Learning, 2000, ICML.
[10] Vikash Kumar, et al. Manipulators and Manipulation in high dimensional spaces, 2016.
[11] Gerald Tesauro, et al. Temporal Difference Learning and TD-Gammon, 1995, J. Int. Comput. Games Assoc.
[12] Sergio Gomez Colmenarejo, et al. Acme: A Research Framework for Distributed Reinforcement Learning, 2020, ArXiv.
[13] Karl Sims, et al. Evolving virtual creatures, 1994, SIGGRAPH.
[14] Sergey Levine, et al. D4RL: Datasets for Deep Data-Driven Reinforcement Learning, 2020, ArXiv.
[15] Brett Browning, et al. A survey of robot learning from demonstration, 2009, Robotics Auton. Syst.
[16] Matthieu Geist, et al. Boosted Bellman Residual Minimization Handling Expert Demonstrations, 2014, ECML/PKDD.
[17] Tom Schaul, et al. StarCraft II: A New Challenge for Reinforcement Learning, 2017, ArXiv.
[18] Yuval Tassa, et al. Data-efficient Deep Reinforcement Learning for Dexterous Manipulation, 2017, ArXiv.
[19] Michail G. Lagoudakis, et al. Least-Squares Policy Iteration, 2003, J. Mach. Learn. Res.
[20] C. Villani. Optimal Transport: Old and New, 2008.
[21] Larry Rudolph, et al. Implementation Matters in Deep Policy Gradients: A Case Study on PPO and TRPO, 2020, ArXiv.
[22] Klaus-Robert Müller, et al. Efficient BackProp, 2012, Neural Networks: Tricks of the Trade.
[23] Stefan Schaal, et al. Is imitation learning the route to humanoid robots?, 1999, Trends in Cognitive Sciences.
[24] Pierre Geurts, et al. Tree-Based Batch Mode Reinforcement Learning, 2005, J. Mach. Learn. Res.
[25] Richard Zemel, et al. A Divergence Minimization Perspective on Imitation Learning Methods, 2019, CoRL.
[26] Sergey Levine, et al. Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor, 2018, ICML.
[27] Amos J. Storkey, et al. Exploration by Random Network Distillation, 2018, ICLR.
[28] Henry Zhu, et al. Soft Actor-Critic Algorithms and Applications, 2018, ArXiv.
[29] Yiannis Demiris, et al. Random Expert Distillation: Imitation Learning via Expert Policy Support Estimation, 2019, ICML.
[30] Jakub W. Pachocki, et al. Dota 2 with Large Scale Deep Reinforcement Learning, 2019, ArXiv.
[31] Tom Schaul, et al. Deep Q-learning From Demonstrations, 2017, AAAI.
[32] Joelle Pineau, et al. Learning from Limited Demonstrations, 2013, NIPS.
[33] Marc G. Bellemare, et al. The Arcade Learning Environment: An Evaluation Platform for General Agents, 2012, J. Artif. Intell. Res.
[34] Wojciech M. Czarnecki, et al. Grandmaster level in StarCraft II using multi-agent reinforcement learning, 2019, Nature.
[35] Dean Pomerleau, et al. Efficient Training of Artificial Neural Networks for Autonomous Navigation, 1991, Neural Computation.
[36] Yoshua Bengio, et al. An empirical evaluation of deep architectures on problems with many factors of variation, 2007, ICML '07.
[37] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[38] Honglak Lee, et al. Predictive Information Accelerates Learning in RL, 2020, NeurIPS.
[39] Kenneth O. Stanley, et al. First return then explore, 2021, Nature.
[40] S. Levine, et al. Offline Reinforcement Learning: Tutorial, Review, and Perspectives on Open Problems, 2020, ArXiv.
[41] Siddhartha Srinivasa, et al. Imitation Learning as f-Divergence Minimization, 2019, WAFR.
[42] Martin A. Riedmiller. Neural Fitted Q Iteration - First Experiences with a Data Efficient Neural Reinforcement Learning Method, 2005, ECML.
[43] Matthieu Geist, et al. Learning from Demonstrations: Is It Worth Estimating a Reward Function?, 2013, ECML/PKDD.
[44] Anind K. Dey, et al. Maximum Entropy Inverse Reinforcement Learning, 2008, AAAI.
[45] Nando de Freitas, et al. Hyperparameter Selection for Offline Reinforcement Learning, 2020, ArXiv.
[46] J. Schulman, et al. Leveraging Procedural Generation to Benchmark Reinforcement Learning, 2019, ICML.
[47] Stefano Ermon, et al. Generative Adversarial Imitation Learning, 2016, NIPS.
[48] Demis Hassabis, et al. Mastering the game of Go with deep neural networks and tree search, 2016, Nature.
[49] Pieter Abbeel, et al. Decoupling Representation Learning from Reinforcement Learning, 2020, ICML.
[50] Nitish Srivastava, et al. Dropout: a simple way to prevent neural networks from overfitting, 2014, J. Mach. Learn. Res.