Peter Vrancx | Ann Nowé | Diederik M. Roijers | Denis Steckelmacher | Anna Harutyunyan
[1] D. Cliff. From Animals to Animats, 1994, Nature.
[2] Sebastian Thrun, et al. Planning under Uncertainty for Reliable Health Care Robotics, 2003, FSR.
[3] Jürgen Schmidhuber, et al. HQ-Learning, 1997, Adapt. Behav.
[4] Shie Mannor, et al. A Deep Hierarchical Approach to Lifelong Learning in Minecraft, 2016, AAAI.
[5] Yishay Mansour, et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation, 1999, NIPS.
[6] Jean-Arcady Meyer, et al. Adaptive Behavior, 2005.
[7] 양정삼. [Overseas University Research Center Profile] Carnegie Mellon University, 2012.
[8] Byron Boots, et al. Closing the learning-planning loop with predictive state representations, 2009, Int. J. Robotics Res.
[9] Leslie Pack Kaelbling, et al. Learning Policies with External Memory, 1999, ICML.
[10] Sridhar Mahadevan, et al. Hierarchical learning and planning in partially observable Markov decision processes, 2002.
[11] Joel W. Burdick, et al. Springer Tracts in Advanced Robotics, 2004.
[12] Richard S. Sutton, et al. Predictive Representations of State, 2001, NIPS.
[13] Jonathan P. How, et al. Decentralized control of multi-robot partially observable Markov decision processes using belief space macro-actions, 2017, Int. J. Robotics Res.
[14] Jürgen Schmidhuber, et al. Long Short-Term Memory, 1997, Neural Computation.
[15] Leslie Pack Kaelbling, et al. Planning and Acting in Partially Observable Stochastic Domains, 1998, Artif. Intell.
[16] Doina Precup, et al. The Option-Critic Architecture, 2016, AAAI.
[17] Peter Tino, et al. IEEE Transactions on Neural Networks, 2009.
[18] Sergio Gomez Colmenarejo, et al. Hybrid computing using a neural network with dynamic external memory, 2016, Nature.
[19] Doina Precup, et al. Temporal abstraction in reinforcement learning, 2000, ICML.
[20] Peter J. Angeline, et al. An evolutionary algorithm that constructs recurrent neural networks, 1994, IEEE Trans. Neural Networks.
[21] Sridhar Mahadevan, et al. Recent Advances in Hierarchical Reinforcement Learning, 2003, Discret. Event Dyn. Syst.
[22] Marc'Aurelio Ranzato, et al. Learning Longer Memory in Recurrent Neural Networks, 2014, ICLR.
[23] M. A. Wiering, et al. Reinforcement Learning in Continuous Action Spaces, 2007, IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning.
[24] Tom M. Mitchell, et al. Reinforcement learning with hidden states, 1993.
[25] Wojciech Zaremba, et al. An Empirical Exploration of Recurrent Network Architectures, 2015, ICML.
[26] Bram Bakker, et al. Reinforcement Learning with Long Short-Term Memory, 2001, NIPS.
[27] Kee-Eung Kim, et al. Learning Finite-State Controllers for Partially Observable Environments, 1999, UAI.
[28] Long Lin, et al. Memory Approaches to Reinforcement Learning in Non-Markovian Domains, 1992.
[29] René Boel, et al. Discrete Event Dynamic Systems: Theory and Applications, 2002.
[30] Alex Graves, et al. Asynchronous Methods for Deep Reinforcement Learning, 2016, ICML.
[31] Leslie Pack Kaelbling, et al. Acting Optimally in Partially Observable Stochastic Domains, 1994, AAAI.
[32] R. Lathe. PhD by thesis, 1988, Nature.
[33] Wojciech Zaremba, et al. Reinforcement Learning Neural Turing Machines, 2015, arXiv.
[34] Shimon Whiteson, et al. Point-Based Planning for Multi-Objective POMDPs, 2015, IJCAI.
[35] David Hsu, et al. Monte Carlo Value Iteration with Macro-Actions, 2011, NIPS.
[36] Andrew Zisserman, et al. Advances in Neural Information Processing Systems (NIPS), 2007.
[37] M. V. Rossum, et al. Neural Computation, 2022.
[38] Kathryn B. Laskey, et al. Proceedings of the Fifteenth Conference on Uncertainty in Artificial Intelligence, 1999.
[39] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[40] Christian Laugier, et al. The International Journal of Robotics Research (IJRR), Special Issue on "Field and Service Robotics", 2009.
[41] Richard Dearden, et al. Planning to see: A hierarchical approach to planning visual actions on a robot using POMDPs, 2010, Artif. Intell.
[42] Nicholas Roy, et al. Efficient Planning under Uncertainty with Macro-actions, 2014, J. Artif. Intell. Res.
[43] D. Signorini, et al. Neural networks, 1995, The Lancet.
[44] Doina Precup, et al. Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning, 1999, Artif. Intell.