Inapplicable Actions Learning for Knowledge Transfer in Reinforcement Learning
[1] Sumitra Ganesh, et al. Factored Policy Gradients: Leveraging Structure for Efficient Learning in MOMDPs, 2021, NeurIPS.
[2] Shengyi Huang, et al. A Closer Look at Invalid Action Masking in Policy Gradient Algorithms, 2020, FLAIRS.
[3] Natalia Gimelshein, et al. PyTorch: An Imperative Style, High-Performance Deep Learning Library, 2019, NeurIPS.
[4] Hector Geffner, et al. Learning First-Order Symbolic Representations for Planning from the Structure of the State Space, 2019, ECAI.
[5] Tom Schaul, et al. Transfer in Deep Reinforcement Learning Using Successor Features and Generalised Policy Improvement, 2018, ICML.
[6] Shie Mannor, et al. Learn What Not to Learn: Action Elimination with Deep Reinforcement Learning, 2018, NeurIPS.
[7] Max Welling, et al. Attention, Learn to Solve Routing Problems!, 2018, ICLR.
[8] Michael I. Jordan, et al. RLlib: Abstractions for Distributed Reinforcement Learning, 2017, ICML.
[9] Tom Schaul, et al. StarCraft II: A New Challenge for Reinforcement Learning, 2017, ArXiv.
[10] Alex S. Fukunaga, et al. Classical Planning in Deep Latent Space: Bridging the Subsymbolic-Symbolic Boundary, 2017, AAAI.
[11] Wojciech Zaremba, et al. OpenAI Gym, 2016, ArXiv.
[12] Demis Hassabis, et al. Mastering the game of Go with deep neural networks and tree search, 2016, Nature.
[13] Shane Legg, et al. Human-level control through deep reinforcement learning, 2015, Nature.
[14] Javier García, et al. Probabilistic Policy Reuse for inter-task transfer learning, 2010, Robotics Auton. Syst.
[15] Andrew K. C. Wong, et al. Classification of Imbalanced Data: a Review, 2009, Int. J. Pattern Recognit. Artif. Intell.
[16] Shie Mannor, et al. Action Elimination and Stopping Conditions for the Multi-Armed Bandit and Reinforcement Learning Problems, 2006, J. Mach. Learn. Res.
[17] Yishay Mansour, et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation, 1999, NIPS.
[18] Doina Precup, et al. Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning, 1999, Artif. Intell.
[19] Richard Fikes, et al. STRIPS: A New Approach to the Application of Theorem Proving to Problem Solving, 1971, IJCAI.
[20] A. Gleave, et al. Stable-Baselines3: Reliable Reinforcement Learning Implementations, 2021, J. Mach. Learn. Res.
[21] Stefan Edelkamp, et al. Automated Planning: Theory and Practice, 2007, Künstliche Intell.