A Deep Hierarchical Reinforcement Learning Algorithm in Partially Observable Markov Decision Processes
[1] Yuval Tassa,et al. Continuous control with deep reinforcement learning , 2015, ICLR.
[2] E. Deci,et al. Intrinsic and Extrinsic Motivations: Classic Definitions and New Directions. , 2000, Contemporary educational psychology.
[3] Joshua B. Tenenbaum,et al. Hierarchical Deep Reinforcement Learning: Integrating Temporal Abstraction and Intrinsic Motivation , 2016, NIPS.
[4] Tom Schaul,et al. FeUdal Networks for Hierarchical Reinforcement Learning , 2017, ICML.
[5] John N. Tsitsiklis,et al. Actor-Critic Algorithms , 1999, NIPS.
[6] Tom Schaul,et al. Prioritized Experience Replay , 2015, ICLR.
[7] Sergey Levine,et al. Continuous Deep Q-Learning with Model-based Acceleration , 2016, ICML.
[8] Andrew G. Barto,et al. Intrinsically Motivated Reinforcement Learning: A Promising Framework for Developmental Robot Learning , 2005.
[9] Ion Stoica,et al. Multi-Level Discovery of Deep Options , 2017, ArXiv.
[10] Murray Shanahan,et al. Classifying Options for Deep Reinforcement Learning , 2016, ArXiv.
[11] Geoffrey E. Hinton,et al. Feudal Reinforcement Learning , 1992, NIPS.
[12] Peter Stone,et al. Deep Recurrent Q-Learning for Partially Observable MDPs , 2015, AAAI Fall Symposia.
[13] Jan Peters,et al. Reinforcement learning in robotics: A survey , 2013, Int. J. Robotics Res.
[14] Chelsea C. White,et al. Procedures for the Solution of a Finite-Horizon, Partially Observed, Semi-Markov Optimization Problem , 1976, Oper. Res.
[15] Andre Cohen,et al. An object-oriented representation for efficient reinforcement learning , 2008, ICML '08.
[16] Wojciech Zaremba,et al. OpenAI Gym , 2016, ArXiv.
[17] Maxim Egorov,et al. Deep Reinforcement Learning with POMDPs , 2015.
[18] Jürgen Schmidhuber,et al. Formal Theory of Creativity, Fun, and Intrinsic Motivation (1990–2010) , 2010, IEEE Transactions on Autonomous Mental Development.
[19] Sridhar Mahadevan,et al. Deep Reinforcement Learning With Macro-Actions , 2016, ArXiv.
[20] Eduardo F. Morales,et al. An Introduction to Reinforcement Learning , 2011.
[21] Stefan Schaal,et al. Policy Gradient Methods for Robotics , 2006, 2006 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[22] Thomas G. Dietterich. Hierarchical Reinforcement Learning with the MAXQ Value Function Decomposition , 1999, J. Artif. Intell. Res.
[23] Youyong Kong,et al. Deep Direct Reinforcement Learning for Financial Signal Representation and Trading , 2017, IEEE Transactions on Neural Networks and Learning Systems.
[24] Doina Precup,et al. Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning , 1999, Artif. Intell.
[25] Yishay Mansour,et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation , 1999, NIPS.
[27] Martín Abadi,et al. TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems , 2016, ArXiv.
[28] Jürgen Leitner,et al. Curiosity driven reinforcement learning for motion planning on humanoids , 2014, Front. Neurorobot.
[29] Byoung-Tak Zhang,et al. Micro-Objective Learning : Accelerating Deep Reinforcement Learning through the Discovery of Continuous Subgoals , 2017, ArXiv.
[30] Shakir Mohamed,et al. Variational Information Maximisation for Intrinsically Motivated Reinforcement Learning , 2015, NIPS.
[31] Stefan Schaal,et al. Natural Actor-Critic , 2003, Neurocomputing.
[32] Rob Fergus,et al. MazeBase: A Sandbox for Learning from Games , 2015, ArXiv.
[33] Nuttapong Chentanez,et al. Intrinsically Motivated Learning of Hierarchical Collections of Skills , 2004.
[34] Von-Wun Soo,et al. Subgoal Identifications in Reinforcement Learning: A Survey , 2011.
[35] Leslie Pack Kaelbling,et al. Planning and Acting in Partially Observable Stochastic Domains , 1998, Artif. Intell.
[36] Doina Precup,et al. The Option-Critic Architecture , 2016, AAAI.
[37] John J. Grefenstette,et al. Evolutionary Algorithms for Reinforcement Learning , 1999, J. Artif. Intell. Res.
[38] Gerald Tesauro,et al. Simulation, learning, and optimization techniques in Watson's game strategies , 2012, IBM J. Res. Dev.
[39] Sergey Levine,et al. Trust Region Policy Optimization , 2015, ICML.
[40] Tom Schaul,et al. Unifying Count-Based Exploration and Intrinsic Motivation , 2016, NIPS.
[41] Tom Schaul,et al. Dueling Network Architectures for Deep Reinforcement Learning , 2015, ICML.
[42] Gerald Tesauro,et al. TD-Gammon: A Self-Teaching Backgammon Program , 1995.
[43] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[44] Shane Legg,et al. Human-level control through deep reinforcement learning , 2015, Nature.
[45] TaeChoong Chung,et al. Bayes-adaptive hierarchical MDPs , 2015, Applied Intelligence.
[46] Ronald E. Parr,et al. Hierarchical control and learning for Markov decision processes , 1998.
[47] Demis Hassabis,et al. Mastering the game of Go with deep neural networks and tree search , 2016, Nature.
[48] Kevin P. Murphy,et al. A Survey of POMDP Solution Techniques , 2000.
[49] David Silver,et al. Deep Reinforcement Learning with Double Q-Learning , 2015, AAAI.
[50] Sridhar Mahadevan,et al. Recent Advances in Hierarchical Reinforcement Learning , 2003, Discret. Event Dyn. Syst.
[51] D. Ernst,et al. Power systems stability control: reinforcement learning framework , 2004, IEEE Transactions on Power Systems.
[52] Sungyoung Lee,et al. Approximate planning for Bayesian hierarchical reinforcement learning , 2014, Applied Intelligence.
[53] Marc Toussaint,et al. Hierarchical Monte-Carlo Planning , 2015, AAAI.
[54] Jan Peters,et al. A Survey on Policy Search for Robotics , 2013, Found. Trends Robotics.
[55] Alex Graves,et al. Playing Atari with Deep Reinforcement Learning , 2013, ArXiv.
[56] Richard L. Lewis,et al. Intrinsically Motivated Reinforcement Learning: An Evolutionary Perspective , 2010, IEEE Transactions on Autonomous Mental Development.
[57] Guillaume Lample,et al. Playing FPS Games with Deep Reinforcement Learning , 2016, AAAI.
[58] Matthew Saffell,et al. Learning to trade via direct reinforcement , 2001, IEEE Trans. Neural Networks.
[59] Automated Discovery of Options in Reinforcement Learning , 2003.
[60] Geoffrey E. Hinton,et al. Using Expectation-Maximization for Reinforcement Learning , 1997, Neural Computation.
[61] Jae Won Lee,et al. Stock price prediction using reinforcement learning , 2001, 2001 IEEE International Symposium on Industrial Electronics (ISIE) Proceedings.
[62] Stuart J. Russell,et al. Reinforcement Learning with Hierarchies of Machines , 1997, NIPS.