Yishay Mansour,et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation , 1999, NIPS.
 Lex Weaver,et al. The Optimal Reward Baseline for Gradient-Based Reinforcement Learning , 2001, UAI.
 Kagan Tumer,et al. Optimal Payoff Functions for Members of Collectives , 2001, Adv. Complex Syst..
 Leslie Pack Kaelbling,et al. All learning is Local: Multi-agent Learning in Global Reward Games , 2003, NIPS.
 Erfu Yang,et al. Multiagent Reinforcement Learning for Multi-Robot Systems: A Survey , 2004 .
 Ronald J. Williams. Simple statistical gradient-following algorithms for connectionist reinforcement learning , 2004, Machine Learning.
 Wang Ying,et al. Multi-agent framework for third party logistics in E-commerce , 2005, Expert Syst. Appl..
 Danny Weyns,et al. The Packet-World: A Test Bed for Investigating Situated Multi-Agent Systems , 2005 .
 Richard S. Sutton,et al. Learning to Predict by the Methods of Temporal Differences , 1988, Machine Learning.
 Kagan Tumer,et al. Distributed agent-based air traffic flow management , 2007, AAMAS '07.
 Nikos A. Vlassis,et al. Optimal and Approximate Q-value Functions for Decentralized POMDPs , 2008, J. Artif. Intell. Res..
 Bart De Schutter,et al. A Comprehensive Survey of Multiagent Reinforcement Learning , 2008, IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews).
 Yoav Shoham,et al. Multiagent Systems - Algorithmic, Game-Theoretic, and Logical Foundations , 2008 .
 Clément Farabet,et al. Torch7: A Matlab-like Environment for Machine Learning , 2011, NIPS 2011.
 Kagan Tumer,et al. Modeling difference rewards for multiagent learning , 2012, AAMAS.
 Wenwu Yu,et al. An Overview of Recent Progress in the Study of Distributed Multi-Agent Coordination , 2012, IEEE Transactions on Industrial Informatics.
 Yoshua Bengio,et al. On the Properties of Neural Machine Translation: Encoder–Decoder Approaches , 2014, SSST@EMNLP.
 Kevin Leyton-Brown,et al. Empirically Evaluating Multiagent Learning Algorithms , 2014, ArXiv.
 Yoshua Bengio,et al. Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling , 2014, ArXiv.
 Kagan Tumer,et al. Approximating Difference Evaluations with Local Information , 2015, AAMAS.
 Yun Yang,et al. A Multi-Agent Framework for Packet Routing in Wireless Sensor Networks , 2015, Sensors.
 Peter Stone,et al. Deep Recurrent Q-Learning for Partially Observable MDPs , 2015, AAAI Fall Symposia.
 Shimon Whiteson,et al. Learning to Communicate with Deep Multi-Agent Reinforcement Learning , 2016, NIPS.
 Bikramjit Banerjee,et al. Multi-agent reinforcement learning as a rehearsal for decentralized planning , 2016, Neurocomputing.
 Rob Fergus,et al. Learning Multiagent Communication with Backpropagation , 2016, NIPS.
 Sergey Levine,et al. High-Dimensional Continuous Control Using Generalized Advantage Estimation , 2015, ICLR.
 Florian Richoux,et al. TorchCraft: a Library for Machine Learning Research on Real-Time Strategy Games , 2016, ArXiv.
 Emil Gustavsson,et al. Learning to Play Guess Who? and Inventing a Grounded Language as a Consequence , 2016, ArXiv.
 Shimon Whiteson,et al. Stabilising Experience Replay for Deep Multi-Agent Reinforcement Learning , 2017, ICML.
 Mykel J. Kochenderfer,et al. Cooperative Multi-agent Control Using Deep Reinforcement Learning , 2017, AAMAS Workshops.
 Nando de Freitas,et al. Sample Efficient Actor-Critic with Experience Replay , 2016, ICLR.
 Yi Wu,et al. Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments , 2017, NIPS.
 Dorian Kodelja,et al. Multiagent cooperation and competition with deep reinforcement learning , 2015, PloS one.
 Jun Wang,et al. Multiagent Bidirectionally-Coordinated Nets for Learning to Play StarCraft Combat Games , 2017, ArXiv.
 Jonathan P. How,et al. Deep Decentralized Multi-task Multi-Agent Reinforcement Learning under Partial Observability , 2017, ICML.
 Jonathan P. How,et al. Deep Decentralized Multi-task Multi-Agent RL under Partial Observability , 2017 .
 Stefan Lee,et al. Learning Cooperative Visual Dialog Agents with Deep Reinforcement Learning , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
 Joel Z. Leibo,et al. Multi-agent Reinforcement Learning in Sequential Social Dilemmas , 2017, AAMAS.
 Alexander Peysakhovich,et al. Multi-Agent Cooperation and the Emergence of (Natural) Language , 2016, ICLR.
 Pieter Abbeel,et al. Emergence of Grounded Compositional Language in Multi-Agent Populations , 2017, AAAI.