Régis Sabbadin,et al. A Tractable Leader-Follower MDP Model for Animal Disease Management , 2013, AAAI.
 John Langford,et al. Approximately Optimal Approximate Reinforcement Learning , 2002, ICML.
 Fei Sha,et al. Actor-Attention-Critic for Multi-Agent Reinforcement Learning , 2019, ICML.
 George J. Pappas,et al. Taxi Dispatch With Real-Time Sensing Data in Metropolitan Areas: A Receding Horizon Control Approach , 2016, IEEE Transactions on Automation Science and Engineering.
 Sergio Valcarcel Macua,et al. Coordinating the Crowd: Inducing Desirable Equilibria in Non-Cooperative Systems , 2019, AAMAS.
 Alex Graves,et al. Strategic Attentive Writer for Learning Macro-Actions , 2016, NIPS.
 Joel Z. Leibo,et al. A Generalised Method for Empirical Game Theoretic Analysis , 2018, AAMAS.
 Utkarsh Upadhyay,et al. Deep Reinforcement Learning of Marked Temporal Point Processes , 2018, NeurIPS.
 Csaba Szepesvári,et al. Fitted Q-iteration in continuous action-space MDPs , 2007, NIPS.
 Yan Zheng,et al. A Deep Bayesian Policy Reuse Approach Against Non-Stationary Agents , 2018, NeurIPS.
 Doina Precup,et al. Intra-Option Learning about Temporally Abstract Actions , 1998, ICML.
 Yi Wu,et al. Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments , 2017, NIPS.
 Alexandre Alahi,et al. Crowd-Robot Interaction: Crowd-Aware Robot Navigation With Attention-Based Deep Reinforcement Learning , 2019, 2019 International Conference on Robotics and Automation (ICRA).
 Claudia V. Goldman,et al. Solving Transition Independent Decentralized Markov Decision Processes , 2004, J. Artif. Intell. Res..
 Chi Cheng,et al. A multi-agent reinforcement learning algorithm based on Stackelberg game , 2017, 2017 6th Data Driven Control and Learning Systems (DDCLS).
 Philip S. Thomas,et al. Learning Action Representations for Reinforcement Learning , 2019, ICML.
 S. Bhattacharyya,et al. Leader-Follower semi-Markov Decision Problems: Theoretical Framework and Approximate Solution , 2007, 2007 IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning.
 Shimon Whiteson,et al. DAC: The Double Actor-Critic Architecture for Learning Options , 2019, NeurIPS.
 Jürgen Schmidhuber,et al. Learning to Forget: Continual Prediction with LSTM , 2000, Neural Computation.
 Joelle Pineau,et al. An Inference-Based Policy Gradient Method for Learning Options , 2018, ICML.
 Lillian J. Ratliff,et al. Convergence of Learning Dynamics in Stackelberg Games , 2019, ArXiv.
 Akshat Kumar,et al. Planning and Learning for Decentralized MDPs With Event Driven Rewards , 2018, AAAI.
 Nicolas Le Roux,et al. The Value Function Polytope in Reinforcement Learning , 2019, ICML.
 Jan Peters,et al. Probabilistic inference for determining options in reinforcement learning , 2016, Machine Learning.
 Alan Fern,et al. Learning and Transferring Roles in Multi-Agent Reinforcement , 2008 .