Satinder Singh | Honglak Lee | Junhyuk Oh | Yijie Guo
[1] Soumith Chintala, et al. Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks, 2015, ICLR.
[2] Sergey Levine, et al. Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor, 2018, ICML.
[3] Sergey Levine, et al. Reinforcement Learning with Deep Energy-Based Policies, 2017, ICML.
[4] Andrew Y. Ng, et al. Policy Invariance Under Reward Transformations: Theory and Application to Reward Shaping, 1999, ICML.
[5] Michael H. Bowling, et al. Apprenticeship learning using linear programming, 2008, ICML '08.
[6] Shie Mannor, et al. End-to-End Differentiable Adversarial Imitation Learning, 2017, ICML.
[7] Quoc V. Le, et al. Neural Program Synthesis with Priority Queue Training, 2018, ArXiv.
[8] Yuval Tassa, et al. MuJoCo: A physics engine for model-based control, 2012, 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[9] Qiang Liu, et al. Learning Self-Imitating Diverse Policies, 2018, ICLR.
[10] Elman Mansimov, et al. Simple Nearest Neighbor Policy Method for Continuous Control Tasks, 2018.
[11] Alec Radford, et al. Proximal Policy Optimization Algorithms, 2017, ArXiv.
[12] Satinder Singh, et al. Self-Imitation Learning, 2018, ICML.
[13] Richard L. Lewis, et al. Where Do Rewards Come From?, 2009.
[14] Bernt Schiele, et al. Generative Adversarial Text to Image Synthesis, 2016, ICML.
[15] Honglak Lee, et al. Deep Learning for Reward Design to Improve Monte Carlo Tree Search in ATARI Games, 2016, IJCAI.
[16] Peter Dayan, et al. Hippocampal Contributions to Control: The Third Way, 2007, NIPS.
[17] Sergey Levine, et al. Recall Traces: Backtracking Models for Efficient Reinforcement Learning, 2018, ICLR.
[18] Richard L. Lewis, et al. Reward Design via Online Gradient Ascent, 2010, NIPS.
[19] Demis Hassabis, et al. Neural Episodic Control, 2017, ICML.
[20] Joel Z. Leibo, et al. Model-Free Episodic Control, 2016, ArXiv.
[21] Yoshua Bengio, et al. Generative Adversarial Nets, 2014, NIPS.
[22] Stefano Ermon, et al. Generative Adversarial Imitation Learning, 2016, NIPS.
[23] Takumi Sugiyama, et al. Study report on “Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks”, 2017.
[24] Chen Liang, et al. Neural Symbolic Machines: Learning Semantic Parsers on Freebase with Weak Supervision, 2016, ACL.
[25] Anind K. Dey, et al. Maximum Entropy Inverse Reinforcement Learning, 2008, AAAI.
[26] Satinder Singh, et al. On Learning Intrinsic Rewards for Policy Gradient Methods, 2018, NeurIPS.
[27] Gaurav S. Sukhatme, et al. Multi-Modal Imitation Learning from Unstructured Demonstrations using Generative Adversarial Nets, 2017, NIPS.
[28] Stefano Ermon, et al. InfoGAIL: Interpretable Imitation Learning from Visual Demonstrations, 2017, NIPS.