Joan Bruna | Kyunghyun Cho | Alex Peysakhovich | Sanyam Kapoor | Roberta Raileanu | Cinjon Resnick
[2] Andrew Y. Ng, et al. Policy Invariance Under Reward Transformations: Theory and Application to Reward Shaping, 1999, ICML.
[3] John Langford, et al. Approximately Optimal Approximate Reinforcement Learning, 2002, ICML.
[4] P. Dal Bó. Cooperation under the Shadow of the Future: Experimental Evidence from Infinitely Repeated Games, 2005.
[5] Raymond J. Dolan, et al. Game Theory of Mind, 2008, PLoS Comput. Biol.
[6] John Langford, et al. Search-based structured prediction, 2009, Machine Learning.
[7] Jason Weston, et al. Curriculum learning, 2009, ICML '09.
[8] Geoffrey J. Gordon, et al. A Reduction of Imitation Learning and Structured Prediction to No-Regret Online Learning, 2010, AISTATS.
[9] Simon M. Lucas, et al. A Survey of Monte Carlo Tree Search Methods, 2012, IEEE Transactions on Computational Intelligence and AI in Games.
[10] Guan-Yu Chen, et al. On the mixing time and spectral gap for birth and death chains, 2013, arXiv:1304.4346.
[11] W. Hong, et al. A note on the passage time of finite-state Markov chains, 2013, arXiv:1302.5987.
[12] Joshua B. Tenenbaum, et al. Coordinate to cooperate or compete: Abstract goals and joint intentions in social interaction, 2016, CogSci.
[13] Jianfeng Gao, et al. Deep Reinforcement Learning for Dialogue Generation, 2016, EMNLP.
[14] Marc'Aurelio Ranzato, et al. Sequence Level Training with Recurrent Neural Networks, 2015, ICLR.
[15] Stefano Ermon, et al. Generative Adversarial Imitation Learning, 2016, NIPS.
[16] Traian Rebedea, et al. Playing Atari Games with Deep Reinforcement Learning and Human Checkpoint Replay, 2016, ArXiv.
[17] Anca D. Dragan, et al. SHIV: Reducing supervisor burden in DAgger using support vectors for efficient learning from demonstrations in high dimensional state spaces, 2016, IEEE International Conference on Robotics and Automation (ICRA).
[18] Demis Hassabis, et al. Mastering the game of Go with deep neural networks and tree search, 2016, Nature.
[19] Kyunghyun Cho, et al. Query-Efficient Imitation Learning for End-to-End Autonomous Driving, 2016, ArXiv.
[20] Sergey Levine, et al. End-to-End Training of Deep Visuomotor Policies, 2015, J. Mach. Learn. Res.
[21] Kevin Waugh, et al. DeepStack: Expert-Level Artificial Intelligence in No-Limit Poker, 2017, ArXiv.
[22] Demis Hassabis, et al. Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm, 2017, ArXiv.
[23] Spyridon Samothrakis, et al. On Monte Carlo Tree Search and Reinforcement Learning, 2017, J. Artif. Intell. Res.
[24] Alexander Peysakhovich, et al. Maintaining cooperation in complex social dilemmas using deep reinforcement learning, 2017, ArXiv.
[25] Pieter Abbeel, et al. Reverse Curriculum Generation for Reinforcement Learning, 2017, CoRL.
[26] Tuomas Sandholm, et al. Safe and Nested Subgame Solving for Imperfect-Information Games, 2017, NIPS.
[27] David Silver, et al. A Unified Game-Theoretic Approach to Multiagent Reinforcement Learning, 2017, NIPS.
[28] Stefan Lee, et al. Learning Cooperative Visual Dialog Agents with Deep Reinforcement Learning, 2017, IEEE International Conference on Computer Vision (ICCV).
[29] Joel Z. Leibo, et al. Multi-agent Reinforcement Learning in Sequential Social Dilemmas, 2017, AAMAS.
[30] Alec Radford, et al. Proximal Policy Optimization Algorithms, 2017, ArXiv.
[31] Sergey Levine, et al. DeepMimic: Example-Guided Deep Reinforcement Learning of Physics-Based Character Skills, 2018, ACM Trans. Graph.
[32] Shimon Whiteson, et al. Learning with Opponent-Learning Awareness, 2017, AAMAS.
[33] Julian Togelius, et al. Pommerman: A Multi-Agent Playground, 2018, AIIDE Workshops.
[34] Alexander Peysakhovich, et al. Learning Social Conventions in Markov Games, 2018, ArXiv.
[35] Alexander Peysakhovich, et al. Consequentialist conditional cooperation in social dilemmas with imperfect information, 2017, AAAI Workshops.
[36] Tim Salimans, et al. Learning Montezuma's Revenge from a Single Demonstration, 2018, ArXiv.
[37] Nando de Freitas, et al. Playing hard exploration games by watching YouTube, 2018, NeurIPS.
[38] Alexander Peysakhovich, et al. Learning Existing Social Conventions in Markov Games, 2018, arXiv:1806.10071.
[39] Pierre Baldi, et al. Solving the Rubik's Cube Without Human Knowledge, 2018, ArXiv.
[40] Marcin Andrychowicz, et al. Overcoming Exploration in Reinforcement Learning with Demonstrations, 2017, IEEE International Conference on Robotics and Automation (ICRA) 2018.
[41] Yang Gao, et al. Reinforcement Learning from Imperfect Demonstrations, 2018, ICLR.
[42] Nando de Freitas, et al. Reinforcement and Imitation Learning for Diverse Visuomotor Skills, 2018, Robotics: Science and Systems.
[43] Julian Togelius, et al. A hybrid search agent in Pommerman, 2018, FDG.
[44] Tom Schaul, et al. Deep Q-learning From Demonstrations, 2017, AAAI.
[45] Stefan Lee, et al. Embodied Question Answering, 2017, IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) 2018.
[46] Ashley D. Edwards, et al. Forward-Backward Reinforcement Learning, 2018, ArXiv.
[47] Andrew Zisserman, et al. Kickstarting Deep Reinforcement Learning, 2018, ArXiv.
[48] Xi Chen, et al. Learning From Demonstration in the Wild, 2018, International Conference on Robotics and Automation (ICRA) 2019.
[49] Sergey Levine, et al. Recall Traces: Backtracking Models for Efficient Reinforcement Learning, 2018, ICLR.
[50] Mo Chen, et al. BaRC: Backward Reachability Curriculum for Robotic Reinforcement Learning, 2018, International Conference on Robotics and Automation (ICRA) 2019.