[1] Alec Radford,et al. Proximal Policy Optimization Algorithms , 2017, ArXiv.
[2] Marvin Minsky,et al. Steps toward Artificial Intelligence , 1961, Proceedings of the IRE.
[3] Razvan Pascanu,et al. Learning to Navigate in Complex Environments , 2016, ICLR.
[4] Shane Legg,et al. Human-level control through deep reinforcement learning , 2015, Nature.
[5] Amnon Shashua,et al. Safe, Multi-Agent, Reinforcement Learning for Autonomous Driving , 2016, ArXiv.
[6] Balaraman Ravindran,et al. Dynamic Action Repetition for Deep Reinforcement Learning , 2017, AAAI.
[7] Sean Luke,et al. Cooperative Multi-Agent Learning: The State of the Art , 2005, Autonomous Agents and Multi-Agent Systems.
[8] Andrew J. Davison,et al. Sim-to-Real Reinforcement Learning for Deformable Object Manipulation , 2018, CoRL.
[9] Matthew E. Taylor,et al. Using Monte Carlo Tree Search as a Demonstrator within Asynchronous Deep RL , 2018, ArXiv.
[10] Chao Gao,et al. Skynet: A Top Deep RL Agent in the Inaugural Pommerman Team Competition , 2019, ArXiv.
[11] Matthew E. Taylor,et al. Agent Modeling as Auxiliary Task for Deep Reinforcement Learning , 2019, AIIDE.
[12] Frans A. Oliehoek,et al. A Concise Introduction to Decentralized POMDPs , 2016, SpringerBriefs in Intelligent Systems.
[13] Andrew Y. Ng,et al. Policy Invariance Under Reward Transformations: Theory and Application to Reward Shaping , 1999, ICML.
[14] T. Das,et al. A Reinforcement Learning Model to Assess Market Power Under Auction-Based Energy Pricing , 2007, IEEE Transactions on Power Systems.
[15] Takayuki Osogami,et al. Real-time tree search with pessimistic scenarios , 2019, ArXiv.
[16] Julian Togelius,et al. A hybrid search agent in pommerman , 2018, FDG.
[17] Yang Yu,et al. Towards Sample Efficient Reinforcement Learning , 2018, IJCAI.
[18] Simon M. Lucas,et al. Analysis of Statistical Forward Planning Methods in Pommerman , 2019, AIIDE.
[19] Dhruv Shah,et al. Multi-Agent Strategies for Pommerman , 2018.
[20] Sergey Levine,et al. Trust Region Policy Optimization , 2015, ICML.
[21] Joan Bruna,et al. Backplay: "Man muss immer umkehren" , 2018, ArXiv.
[22] Chao Gao,et al. Continual Match Based Training in Pommerman: Technical Report , 2018, ArXiv.
[23] Laurent Jeanpierre,et al. Coordinated Multi-Robot Exploration Under Communication Constraints Using Decentralized Markov Decision Processes , 2012, AAAI.
[24] Balaraman Ravindran,et al. Learning to Repeat: Fine Grained Action Repetition for Deep Reinforcement Learning , 2017, ICLR.
[25] Julian Togelius,et al. Pommerman: A Multi-Agent Playground , 2018, AIIDE Workshops.
[26] Demis Hassabis,et al. Mastering the game of Go with deep neural networks and tree search , 2016, Nature.
[27] Matthew E. Taylor,et al. Safer Deep RL with Shallow MCTS: A Case Study in Pommerman , 2019, ArXiv.
[28] Daniel Kudenko,et al. Deep Multi-Agent Reinforcement Learning with Relevance Graphs , 2018, ArXiv.