Hierarchical Reinforcement Learning
Ah-Hwee Tan | Hiok Chai Quek | Budhitama Subagdja | Shubham Pateria
[1] André da Motta Salles Barreto,et al. Graph-Based Skill Acquisition For Reinforcement Learning , 2019, ACM Comput. Surv..
[2] Willi-Hans Steeb,et al. Finite State Machines , 2001 .
[3] Doina Precup,et al. The Option-Critic Architecture , 2016, AAAI.
[4] L. Baum,et al. A Maximization Technique Occurring in the Statistical Analysis of Probabilistic Functions of Markov Chains , 1970 .
[5] Ah-Hwee Tan,et al. Multi-agent Reinforcement Learning in Spatial Domain Tasks using Inter Subtask Empowerment Rewards , 2019, 2019 IEEE Symposium Series on Computational Intelligence (SSCI).
[6] Andrew G. Barto,et al. Skill Characterization Based on Betweenness , 2008, NIPS.
[7] Sergey Levine,et al. Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor , 2018, ICML.
[8] Sergey Levine,et al. Trust Region Policy Optimization , 2015, ICML.
[9] Stuart J. Russell,et al. Reinforcement Learning with Hierarchies of Machines , 1997, NIPS.
[10] Kate Saenko,et al. Learning Multi-Level Hierarchies with Hindsight , 2017, ICLR.
[11] Shimon Whiteson,et al. DAC: The Double Actor-Critic Architecture for Learning Options , 2019, NeurIPS.
[12] Lars Niklasson,et al. Time series segmentation using an adaptive resource allocating vector quantization network based on change detection , 2000, Proceedings of the IEEE-INNS-ENNS International Joint Conference on Neural Networks. IJCNN 2000. Neural Computing: New Challenges and Perspectives for the New Millennium.
[13] Doina Precup,et al. When Waiting is not an Option: Learning Options with a Deliberation Cost , 2017, AAAI.
[14] Pieter Abbeel,et al. Meta Learning Shared Hierarchies , 2017, ICLR.
[15] Alicia P. Wolfe,et al. Identifying useful subgoals in reinforcement learning by local graph partitioning , 2005, ICML.
[16] Sergey Levine,et al. Why Does Hierarchy (Sometimes) Work So Well in Reinforcement Learning? , 2019, ArXiv.
[17] Raia Hadsell,et al. CoMic: Complementary Task Learning & Mimicry for Reusable Skills , 2020, ICML.
[18] Yi Wu,et al. Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments , 2017, NIPS.
[19] Peter Dayan,et al. Feudal Multi-Agent Hierarchies for Cooperative Reinforcement Learning , 2019, ICLR.
[20] Andrew G. Barto,et al. Skill Discovery in Continuous Reinforcement Learning Domains using Skill Chaining , 2009, NIPS.
[21] Tom Schaul,et al. Unifying Count-Based Exploration and Intrinsic Motivation , 2016, NIPS.
[22] Alessandro Lazaric,et al. Transfer in Reinforcement Learning: A Framework and a Survey , 2012, Reinforcement Learning.
[23] Daan Wierstra,et al. Variational Intrinsic Control , 2016, ICLR.
[24] Jürgen Schmidhuber,et al. Long Short-Term Memory , 1997, Neural Computation.
[25] Chelsea Finn,et al. Language as an Abstraction for Hierarchical Deep Reinforcement Learning , 2019, NeurIPS.
[26] Sridhar Mahadevan,et al. Hierarchical multi-agent reinforcement learning , 2001, AGENTS '01.
[27] Sergey Levine,et al. Learning to Walk via Deep Reinforcement Learning , 2018, Robotics: Science and Systems.
[28] Joshua B. Tenenbaum,et al. Hierarchical Deep Reinforcement Learning: Integrating Temporal Abstraction and Intrinsic Motivation , 2016, NIPS.
[29] Tom Schaul,et al. FeUdal Networks for Hierarchical Reinforcement Learning , 2017, ICML.
[30] Pieter Abbeel,et al. Stochastic Neural Networks for Hierarchical Reinforcement Learning , 2016, ICLR.
[31] Sergey Levine,et al. Data-Efficient Hierarchical Reinforcement Learning , 2018, NeurIPS.
[32] Doina Precup,et al. Learning Options End-to-End for Continuous Action Tasks , 2017, ArXiv.
[33] Honglak Lee,et al. Hierarchical Reinforcement Learning for Zero-shot Generalization with Subtask Dependencies , 2018, NeurIPS.
[34] Yoshua Bengio,et al. On the Properties of Neural Machine Translation: Encoder–Decoder Approaches , 2014, SSST@EMNLP.
[35] Marlos C. Machado,et al. A Laplacian Framework for Option Discovery in Reinforcement Learning , 2017, ICML.
[36] Rob Fergus,et al. Learning Goal Embeddings via Self-Play for Hierarchical Reinforcement Learning , 2018, ArXiv.
[37] Thomas G. Dietterich. Hierarchical Reinforcement Learning with the MAXQ Value Function Decomposition , 1999, J. Artif. Intell. Res..
[38] Doina Precup,et al. The Option Keyboard: Combining Skills in Reinforcement Learning , 2021, NeurIPS.
[39] Doina Precup,et al. Learning Options with Interest Functions , 2019, AAAI.
[40] Yuval Tassa,et al. MuJoCo: A physics engine for model-based control , 2012, 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[41] Nahum Shimkin,et al. Unified Inter and Intra Options Learning Using Policy Gradient Methods , 2011, EWRL.
[42] R. Bellman,et al. On the Theory of Dynamic Programming , 1952, Proceedings of the National Academy of Sciences of the United States of America.
[43] Sergey Levine,et al. Near-Optimal Representation Learning for Hierarchical Reinforcement Learning , 2018, ICLR.
[44] Shimon Whiteson,et al. QMIX: Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning , 2018, ICML.
[45] Ion Stoica,et al. Multi-Level Discovery of Deep Options , 2017, ArXiv.
[46] Jonathan P. How,et al. Learning for Decentralized Control of Multiagent Systems in Large, Partially-Observable Stochastic Environments , 2016, AAAI.
[47] Peter Stone,et al. The utility of temporal abstraction in reinforcement learning , 2008, AAMAS.
[48] Jan Peters,et al. Probabilistic inference for determining options in reinforcement learning , 2016, Machine Learning.
[49] Karol Hausman,et al. Learning an Embedding Space for Transferable Robot Skills , 2018, ICLR.
[50] Ilya Kostrikov,et al. Intrinsic Motivation and Automatic Curricula via Asymmetric Self-Play , 2017, ICLR.
[51] Haim Kaplan,et al. Planning in Hierarchical Reinforcement Learning: Guarantees for Using Local Policies , 2019, ALT.
[52] Sridhar Mahadevan,et al. Recent Advances in Hierarchical Reinforcement Learning , 2003, Discret. Event Dyn. Syst..
[53] Mostafa Al-Emran,et al. Hierarchical Reinforcement Learning: A Survey , 2015 .
[54] Hussein A. Abbass,et al. Hierarchical Deep Reinforcement Learning for Continuous Action Control , 2018, IEEE Transactions on Neural Networks and Learning Systems.
[55] Pieter Abbeel,et al. Variational Option Discovery Algorithms , 2018, ArXiv.
[56] Yuval Tassa,et al. Continuous control with deep reinforcement learning , 2015, ICLR.
[57] Andrew G. Barto,et al. Building Portable Options: Skill Transfer in Reinforcement Learning , 2007, IJCAI.
[58] Alexei A. Efros,et al. Curiosity-Driven Exploration by Self-Supervised Prediction , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).
[59] Geoffrey E. Hinton,et al. Feudal Reinforcement Learning , 1992, NIPS.
[60] Shie Mannor,et al. Q-Cut - Dynamic Discovery of Sub-goals in Reinforcement Learning , 2002, ECML.
[61] Samuel Gershman,et al. Deep Successor Reinforcement Learning , 2016, ArXiv.
[62] Peter Dayan,et al. Q-learning , 1992, Machine Learning.
[63] Sergey Levine,et al. Search on the Replay Buffer: Bridging Planning and Reinforcement Learning , 2019, NeurIPS.
[64] Shimon Whiteson,et al. Learning to Communicate with Deep Multi-Agent Reinforcement Learning , 2016, NIPS.
[65] Doina Precup,et al. Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning , 1999, Artif. Intell..
[66] Jürgen Schmidhuber,et al. Hierarchical reinforcement learning with subpolicies specializing for learned subgoals , 2004, Neural Networks and Computational Intelligence.
[67] Magnus Borga,et al. Hierarchical Reinforcement Learning , 1993 .
[68] Li Fei-Fei,et al. CLEVR: A Diagnostic Dataset for Compositional Language and Elementary Visual Reasoning , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[69] Leslie Pack Kaelbling,et al. Planning and Acting in Partially Observable Stochastic Domains , 1998, Artif. Intell..
[70] Yishay Mansour,et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation , 1999, NIPS.
[71] Max Welling,et al. Auto-Encoding Variational Bayes , 2013, ICLR.
[72] Sridhar Mahadevan,et al. Learning to Take Concurrent Actions , 2002, NIPS.
[73] Sergey Levine,et al. Self-Consistent Trajectory Autoencoder: Hierarchical Reinforcement Learning with Trajectory Embeddings , 2018, ICML.
[74] Yoshua Bengio,et al. Why Does Unsupervised Pre-training Help Deep Learning? , 2010, AISTATS.
[75] Hongyuan Zha,et al. Hierarchical Cooperative Multi-Agent Reinforcement Learning with Skill Discovery , 2020, AAMAS.
[76] Sergey Levine,et al. Dynamics-Aware Unsupervised Discovery of Skills , 2019, ICLR.
[77] Doina Precup,et al. Options of Interest: Temporal Abstraction with Interest Functions , 2020, AAAI.
[78] Shane Legg,et al. Human-level control through deep reinforcement learning , 2015, Nature.
[79] Doina Precup,et al. Learning Options in Reinforcement Learning , 2002, SARA.
[80] Shimon Whiteson,et al. Counterfactual Multi-Agent Policy Gradients , 2017, AAAI.
[81] Tomás Lozano-Pérez,et al. A Framework for Multiple-Instance Learning , 1997, NIPS.
[82] Andrew G. Barto,et al. Automatic Discovery of Subgoals in Reinforcement Learning using Diverse Density , 2001, ICML.
[83] Alborz Geramifard,et al. Decentralized control of Partially Observable Markov Decision Processes using belief space macro-actions , 2015, 2015 IEEE International Conference on Robotics and Automation (ICRA).
[84] Razvan Pascanu,et al. Policy Distillation , 2015, ICLR.
[85] Li Wang,et al. Hierarchical Deep Multiagent Reinforcement Learning , 2018, ArXiv.
[86] Sergey Levine,et al. Deep Reinforcement Learning for Robotic Manipulation , 2016, ArXiv.
[87] D. R. Fulkerson,et al. On the Max Flow Min Cut Theorem of Networks , 1955 .
[88] Rob Fergus,et al. Learning Multiagent Communication with Backpropagation , 2016, NIPS.
[89] Richard Bellman,et al. Dynamic Programming Treatment of the Travelling Salesman Problem , 1962, JACM.
[90] Sergey Levine,et al. Diversity is All You Need: Learning Skills without a Reward Function , 2018, ICLR.
[91] Gerald Tesauro,et al. Learning Abstract Options , 2018, NeurIPS.
[92] Sergey Levine,et al. Relay Policy Learning: Solving Long-Horizon Tasks via Imitation and Reinforcement Learning , 2019, CoRL.
[93] Shie Mannor,et al. A Deep Hierarchical Approach to Lifelong Learning in Minecraft , 2016, AAAI.
[94] Doina Precup,et al. Option-critic in cooperative multi-agent systems , 2019, AAMAS.
[95] George Konidaris,et al. Option Discovery using Deep Skill Chaining , 2020, ICLR.
[96] Sridhar Mahadevan,et al. Proto-value Functions: A Laplacian Framework for Learning Representation and Control in Markov Decision Processes , 2007, J. Mach. Learn. Res..