Patrick MacAlpine | Matthew J. Hausknecht | Jonas Kubilius | Andrey Kolobov | Adrien Gaidon | John Schulman | Sahika Genc | Karl Cobbe | Christopher Hesse | Jacob Hilton | Xiaocheng Tang | Blake Wulfe | William H. Guss | Sharada Mohanty | Dipam Chakraborty | Jyotish Poonganam | Thomas Tumiel | Gražvydas Šemetulskis | João Schapke | Jurgis Pašukonis | Linas Klimas | Quang Nhat Tran | Xinwei Chen