Learning Hierarchical Teaching Policies for Cooperative Agents
Jonathan P. How | Gerald Tesauro | Murray Campbell | Miao Liu | Golnaz Habibi | Sebastian Lopez-Cot | Matthew Riemer | Sami Mourad | Shayegan Omidshafiei | Dong-Ki Kim
[1] Shimon Whiteson, et al. Counterfactual Multi-Agent Policy Gradients, 2017, AAAI.
[2] Michael L. Littman, et al. Markov Games as a Framework for Multi-Agent Reinforcement Learning, 1994, ICML.
[3] Yi Wu, et al. Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments, 2017, NIPS.
[4] Joshua B. Tenenbaum, et al. Hierarchical Deep Reinforcement Learning: Integrating Temporal Abstraction and Intrinsic Motivation, 2016, NIPS.
[5] Geoffrey E. Hinton, et al. Feudal Reinforcement Learning, 1992, NIPS.
[6] Guigang Zhang, et al. Deep Learning, 2016, Int. J. Semantic Comput.
[7] Qiang Liu, et al. Learning to Explore with Meta-Policy Gradient, 2018, ICML.
[8] J. Stenton, et al. Learning how to teach, 1973, Nursing mirror and midwives journal.
[9] Nan Jiang, et al. Hierarchical Imitation and Reinforcement Learning, 2018, ICML.
[10] Jason Weston, et al. Curriculum learning, 2009, ICML '09.
[11] Yuval Tassa, et al. Continuous control with deep reinforcement learning, 2015, ICLR.
[12] Geoffrey J. Gordon, et al. A Reduction of Imitation Learning and Structured Prediction to No-Regret Online Learning, 2010, AISTATS.
[13] Felipe Leno da Silva, et al. Simultaneously Learning and Advising in Multiagent Reinforcement Learning, 2017, AAMAS.
[14] Alex Graves, et al. Automated Curriculum Learning for Neural Networks, 2017, ICML.
[15] Ben Poole, et al. Categorical Reparameterization with Gumbel-Softmax, 2016, ICLR.
[16] Herke van Hoof, et al. Addressing Function Approximation Error in Actor-Critic Methods, 2018, ICML.
[17] Diego Perez Liebana, et al. Teaching on a Budget in Multi-Agent Deep Reinforcement Learning, 2019, 2019 IEEE Conference on Games (CoG).
[18] Tom Schaul, et al. FeUdal Networks for Hierarchical Reinforcement Learning, 2017, ICML.
[19] J. Andrew Bagnell, et al. Reinforcement and Imitation Learning via Interactive No-Regret Learning, 2014, ArXiv.
[20] Andrea Lockerd Thomaz, et al. Reinforcement Learning with Human Teachers: Evidence of Feedback and Guidance with Implications for Learning Performance, 2006, AAAI.
[21] Jun Wang, et al. Multi-Agent Reinforcement Learning, 2020, Deep Reinforcement Learning.
[22] Yi Wu, et al. Robust Multi-Agent Reinforcement Learning via Minimax Deep Deterministic Policy Gradient, 2019, AAAI.
[23] Peter Stone, et al. Transfer Learning for Reinforcement Learning Domains: A Survey, 2009, J. Mach. Learn. Res.
[24] E. Rogers, et al. Diffusion of innovations, 1964, Encyclopedia of Sport Management.
[25] Gerald Tesauro, et al. Learning Abstract Options, 2018, NeurIPS.
[26] Ofra Amir, et al. Interactive Teaching Strategies for Agent Training, 2016, IJCAI.
[27] Jonathan P. How, et al. Learning to Teach in Cooperative Multiagent Reinforcement Learning, 2018, AAAI.
[28] Felipe Leno da Silva, et al. A Survey on Transfer Learning for Multiagent Reinforcement Learning Systems, 2019, J. Artif. Intell. Res.
[29] Alex Graves, et al. Asynchronous Methods for Deep Reinforcement Learning, 2016, ICML.
[30] Matthew E. Taylor, et al. Teaching on a budget: agents advising agents in reinforcement learning, 2013, AAMAS.
[31] Paul E. Utgoff, et al. On integrating apprentice learning and reinforcement learning, 1996.
[32] Marcus Hutter, et al. Reinforcement learning with value advice, 2014, ACML.
[33] Ronald J. Williams, et al. Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning, 1992, Machine Learning.
[34] Qiang Liu, et al. Learning to Explore via Meta-Policy Gradient, 2018, ICML.
[35] Li Fei-Fei, et al. MentorNet: Learning Data-Driven Curriculum for Very Deep Neural Networks on Corrupted Labels, 2017, ICML.
[36] Shane Legg, et al. Human-level control through deep reinforcement learning, 2015, Nature.
[37] Sergey Levine, et al. Data-Efficient Hierarchical Reinforcement Learning, 2018, NeurIPS.
[38] Yulia Tsvetkov, et al. Learning the Curriculum with Bayesian Optimization for Task-Specific Word Representation Learning, 2016, ACL.
[39] Doina Precup, et al. The Option-Critic Architecture, 2016, AAAI.
[40] Yisong Yue, et al. Coordinated Multi-Agent Imitation Learning, 2017, ICML.
[41] Stuart J. Russell, et al. Reinforcement Learning with Hierarchies of Machines, 1997, NIPS.
[42] Thomas G. Dietterich. Hierarchical Reinforcement Learning with the MAXQ Value Function Decomposition, 1999, J. Artif. Intell. Res.
[43] Doina Precup, et al. Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning, 1999, Artif. Intell.