Matthieu Geist | Olivier Pietquin | Robert Dadashi | Léonard Hussenot
[1] Vikash Kumar,et al. MuJoCo HAPTIX: A virtual reality system for hand manipulation , 2015, 2015 IEEE-RAS 15th International Conference on Humanoid Robots (Humanoids).
[2] Marc G. Bellemare,et al. A Distributional Perspective on Reinforcement Learning , 2017, ICML.
[3] Sergio Gomez Colmenarejo,et al. Acme: A Research Framework for Distributed Reinforcement Learning , 2020, ArXiv.
[4] Andrew Y. Ng,et al. Algorithms for Inverse Reinforcement Learning , 2000, ICML.
[5] Gabriel Peyré,et al. Computational Optimal Transport , 2018, Found. Trends Mach. Learn..
[6] Robert Givan,et al. Equivalence notions and model minimization in Markov decision processes , 2003, Artif. Intell..
[7] Michael I. Jordan,et al. Learning to Score Behaviors for Guided Policy Optimization , 2020, ICML.
[8] Peter Stone,et al. Behavioral Cloning from Observation , 2018, IJCAI.
[9] Matthieu Geist,et al. Learning from Demonstrations: Is It Worth Estimating a Reward Function? , 2013, ECML/PKDD.
[10] Gerald Tesauro,et al. Temporal Difference Learning and TD-Gammon , 1995, J. Int. Comput. Games Assoc..
[11] Sergey Levine,et al. DeepMimic: Example-Guided Deep Reinforcement Learning of Physics-Based Character Skills , 2018, ACM Trans. Graph..
[12] Anca D. Dragan,et al. SQIL: Imitation Learning via Regularized Behavioral Cloning , 2019, ArXiv.
[13] Jakub W. Pachocki,et al. Learning dexterous in-hand manipulation , 2018, Int. J. Robotics Res..
[14] Mohammed Abdullah,et al. A note on reinforcement learning with Wasserstein distance regularisation, with applications to multipolicy learning , 2018, ArXiv.
[15] Stefano Ermon,et al. Generative Adversarial Imitation Learning , 2016, NIPS.
[16] Sergey Levine,et al. Learning Robust Rewards with Adversarial Inverse Reinforcement Learning , 2017, ICLR.
[17] Ana Paiva,et al. Learning from a Learner , 2019, ICML.
[18] Yannick Schroecker,et al. State Aware Imitation Learning , 2017, NIPS.
[19] Michiel van de Panne,et al. Displacement Interpolation Using Lagrangian Mass Transport , 2011, ACM Trans. Graph..
[20] Shane Legg,et al. Human-level control through deep reinforcement learning , 2015, Nature.
[21] Kee-Eung Kim,et al. Imitation Learning via Kernel Mean Embedding , 2018, AAAI.
[22] Anind K. Dey,et al. Maximum Entropy Inverse Reinforcement Learning , 2008, AAAI.
[23] Matthieu Geist,et al. Inverse Reinforcement Learning through Structured Classification , 2012, NIPS.
[24] Geoffrey E. Hinton,et al. Layer Normalization , 2016, ArXiv.
[25] David M. Bradley,et al. Boosting Structured Prediction for Imitation Learning , 2006, NIPS.
[26] Herke van Hoof,et al. Addressing Function Approximation Error in Actor-Critic Methods , 2018, ICML.
[27] Dean Pomerleau,et al. Efficient Training of Artificial Neural Networks for Autonomous Navigation , 1991, Neural Computation.
[28] Prabhat Nagarajan,et al. Extrapolating Beyond Suboptimal Demonstrations via Inverse Reinforcement Learning from Observations , 2019, ICML.
[29] Pablo Samuel Castro,et al. Scalable methods for computing state similarity in deterministic Markov Decision Processes , 2019, AAAI.
[30] Pieter Abbeel,et al. Apprenticeship learning via inverse reinforcement learning , 2004, ICML.
[31] Yannick Schroecker,et al. Imitating Latent Policies from Observation , 2018, ICML.
[32] Yiannis Demiris,et al. Random Expert Distillation: Imitation Learning via Expert Policy Support Estimation , 2019, ICML.
[33] Tom Schaul,et al. Deep Q-learning From Demonstrations , 2017, AAAI.
[34] Marc G. Bellemare,et al. DeepMDP: Learning Continuous Latent Space Models for Representation Learning , 2019, ICML.
[35] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[36] Jonathan Tompson,et al. Temporal Cycle-Consistency Learning , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[37] Yuval Tassa,et al. MuJoCo: A physics engine for model-based control , 2012, 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[38] Huang Xiao,et al. Wasserstein Adversarial Imitation Learning , 2019, ArXiv.
[39] Martin L. Puterman,et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming , 1994 .
[40] Matthew W. Hoffman,et al. Distributed Distributional Deterministic Policy Gradients , 2018, ICLR.
[41] Hao Su,et al. State Alignment-based Imitation Learning , 2019, ICLR.
[42] Léon Bottou,et al. Wasserstein Generative Adversarial Networks , 2017, ICML.
[43] Matthieu Geist,et al. A Cascaded Supervised Learning Approach to Inverse Reinforcement Learning , 2013, ECML/PKDD.
[44] Demis Hassabis,et al. Mastering the game of Go with deep neural networks and tree search , 2016, Nature.
[45] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, MIT Press.
[46] Sergey Levine,et al. Learning Complex Dexterous Manipulation with Deep Reinforcement Learning and Demonstrations , 2017, Robotics: Science and Systems.
[47] Sepp Hochreiter,et al. Fast and Accurate Deep Network Learning by Exponential Linear Units (ELUs) , 2015, ICLR.
[48] Stefano Ermon,et al. InfoGAIL: Interpretable Imitation Learning from Visual Demonstrations , 2017, NIPS.
[49] Mikael Henaff,et al. Disagreement-Regularized Imitation Learning , 2020, ICLR.
[50] Jan Peters,et al. Relative Entropy Inverse Reinforcement Learning , 2011, AISTATS.
[51] Richard Zemel,et al. A Divergence Minimization Perspective on Imitation Learning Methods , 2019, CoRL.
[52] Nando de Freitas,et al. Playing hard exploration games by watching YouTube , 2018, NeurIPS.
[53] Siddhartha Srinivasa,et al. Imitation Learning as f-Divergence Minimization , 2019, WAFR.
[54] Matthieu Geist,et al. Boosted and reward-regularized classification for apprenticeship learning , 2014, AAMAS.
[55] Sergey Levine,et al. Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor , 2018, ICML.
[56] Sergey Levine,et al. Trust Region Policy Optimization , 2015, ICML.
[57] Amos J. Storkey,et al. Exploration by Random Network Distillation , 2018, ICLR.
[58] Henry Zhu,et al. Soft Actor-Critic Algorithms and Applications , 2018, ArXiv.
[59] Marco Pavone,et al. Risk-Sensitive Generative Adversarial Imitation Learning , 2018, AISTATS.
[60] J. Andrew Bagnell,et al. Efficient Reductions for Imitation Learning , 2010, AISTATS.
[61] Matthieu Geist,et al. Bridging the Gap Between Imitation Learning and Inverse Reinforcement Learning , 2017, IEEE Transactions on Neural Networks and Learning Systems.
[62] Yuval Tassa,et al. Continuous control with deep reinforcement learning , 2015, ICLR.
[63] Yoshua Bengio,et al. Generative Adversarial Nets , 2014, NIPS.
[64] C. Villani. Optimal Transport: Old and New , 2008 .
[65] Doina Precup,et al. Metrics for Finite Markov Decision Processes , 2004, AAAI.
[66] Sergey Levine,et al. Guided Cost Learning: Deep Inverse Optimal Control via Policy Optimization , 2016, ICML.
[67] Manuel Lopes,et al. Learning from Demonstration Using MDP Induced Metrics , 2010, ECML/PKDD.
[68] Wojciech Zaremba,et al. OpenAI Gym , 2016, ArXiv.
[69] Sridhar Mahadevan,et al. Proto-value Functions: A Laplacian Framework for Learning Representation and Control in Markov Decision Processes , 2007, J. Mach. Learn. Res..