Ion Stoica | Sanjay Krishnan | Roy Fox | Kenneth Y. Goldberg
[1] Zoubin Ghahramani,et al. Optimization with EM and Expectation-Conjugate-Gradient , 2003, ICML.
[2] L. Baum,et al. An inequality and associated maximization technique in statistical estimation of probabilistic functions of a Markov process , 1972 .
[3] Joshua B. Tenenbaum,et al. Hierarchical Deep Reinforcement Learning: Integrating Temporal Abstraction and Intrinsic Motivation , 2016, NIPS.
[4] Geoffrey E. Hinton,et al. Feudal Reinforcement Learning , 1992, NIPS.
[5] M. Botvinick,et al. Hierarchically organized behavior and its neural foundations: A reinforcement learning perspective , 2009, Cognition.
[6] Andrew G. Barto,et al. Automatic Discovery of Subgoals in Reinforcement Learning using Diverse Density , 2001, ICML.
[7] Jeffrey M. Zacks,et al. Prediction Error Associated with the Perceptual Segmentation of Naturalistic Events , 2011, Journal of Cognitive Neuroscience.
[8] Shane Legg,et al. Human-level control through deep reinforcement learning , 2015, Nature.
[9] Yuval Tassa,et al. Learning and Transfer of Modulated Locomotor Controllers , 2016, ArXiv.
[10] Sridhar Mahadevan,et al. Recent Advances in Hierarchical Reinforcement Learning , 2003, Discret. Event Dyn. Syst..
[11] Shie Mannor,et al. Q-Cut - Dynamic Discovery of Sub-goals in Reinforcement Learning , 2002, ECML.
[12] Pieter Abbeel,et al. Stochastic Neural Networks for Hierarchical Reinforcement Learning , 2016, ICLR.
[13] Automated Discovery of Options in Reinforcement Learning, 2003.
[14] Nahum Shimkin,et al. Unified Inter and Intra Options Learning Using Policy Gradient Methods , 2011, EWRL.
[15] Ronald E. Parr,et al. Hierarchical control and learning for markov decision processes , 1998 .
[16] Brijen Thananjeyan, et al. SWIRL: A Sequential Windowed Inverse Reinforcement Learning Algorithm for Robot Tasks With Delayed Rewards, 2016, Workshop on the Algorithmic Foundations of Robotics.
[17] Pieter Abbeel,et al. Hierarchical Apprenticeship Learning with Application to Quadruped Locomotion , 2007, NIPS.
[18] Sebastian Thrun,et al. Finding Structure in Reinforcement Learning , 1994, NIPS.
[19] Alec Solway,et al. Optimal Behavioral Hierarchy , 2014, PLoS Comput. Biol..
[20] Jordi Grau-Moya,et al. Bounded Rationality, Abstraction, and Hierarchical Decision-Making: An Information-Theoretic Optimality Principle , 2015, Front. Robot. AI.
[21] Roy Fox,et al. Principled Option Learning in Markov Decision Processes , 2016, ArXiv.
[22] Brijen Thananjeyan,et al. SWIRL: A sequential windowed inverse reinforcement learning algorithm for robot tasks with delayed rewards , 2018, Int. J. Robotics Res..
[23] Balaraman Ravindran, et al. Option Discovery in Hierarchical Reinforcement Learning using Spatio-Temporal Clustering, 2016, ArXiv abs/1605.05359.
[24] Doina Precup,et al. Learning with options : Just deliberate and relax , 2015 .
[25] Stuart J. Russell,et al. Reinforcement Learning with Hierarchies of Machines , 1997, NIPS.
[26] Gregory D. Hager,et al. Transition state clustering: Unsupervised surgical trajectory segmentation for robot learning , 2017, ISRR.
[27] M. Botvinick. Hierarchical models of behavior and prefrontal function , 2008, Trends in Cognitive Sciences.
[28] Thomas G. Dietterich. Hierarchical Reinforcement Learning with the MAXQ Value Function Decomposition , 1999, J. Artif. Intell. Res..
[29] Scott Kuindersma,et al. Robot learning from demonstration by constructing skill trees , 2012, Int. J. Robotics Res..
[30] Doina Precup,et al. Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning , 1999, Artif. Intell..
[31] Doina Precup,et al. The Option-Critic Architecture , 2016, AAAI.
[32] Sergey Levine,et al. Unsupervised Perceptual Rewards for Imitation Learning , 2016, Robotics: Science and Systems.
[33] Andrew G. Barto,et al. Building Portable Options: Skill Transfer in Reinforcement Learning , 2007, IJCAI.
[34] Vicenç Gómez,et al. Hierarchical Linearly-Solvable Markov Decision Problems , 2016, ICAPS.
[35] Jan Peters,et al. Probabilistic inference for determining options in reinforcement learning , 2016, Machine Learning.
[36] Leslie Pack Kaelbling,et al. Hierarchical Learning in Stochastic Domains: Preliminary Results , 1993, ICML.
[37] Bernhard Hengst,et al. Discovering Hierarchy in Reinforcement Learning with HEXQ , 2002, ICML.
[38] Thomas G. Dietterich. What is machine learning? , 2020, Archives of Disease in Childhood.
[39] Joseph Gonzalez,et al. Composing Meta-Policies for Autonomous Driving Using Hierarchical Deep Reinforcement Learning , 2017, ArXiv.
[40] Tom Schaul,et al. Universal Value Function Approximators , 2015, ICML.
[41] Svetha Venkatesh,et al. Policy Recognition in the Abstract Hidden Markov Model , 2002, J. Artif. Intell. Res..
[42] Roderic A. Grupen,et al. A feedback control structure for on-line learning tasks , 1997, Robotics Auton. Syst..
[43] Alan Fern,et al. Active Imitation Learning of Hierarchical Policies , 2015, IJCAI.
[44] Andrew G. Barto,et al. Skill Discovery in Continuous Reinforcement Learning Domains using Skill Chaining , 2009, NIPS.
[45] Andrew G. Barto,et al. Using relative novelty to identify useful temporal abstractions in reinforcement learning , 2004, ICML.
[46] Balaraman Ravindran,et al. Learning to Repeat: Fine Grained Action Repetition for Deep Reinforcement Learning , 2017, ICLR.
[47] Jan Peters,et al. Hierarchical Relative Entropy Policy Search , 2014, AISTATS.
[48] Rodney A. Brooks, et al. A Robust Layered Control System for a Mobile Robot, 1986, IEEE Journal of Robotics and Automation.
[49] A. Whiten,et al. Imitation of hierarchical action structure by young children. , 2006, Developmental science.
[50] Henryk Michalewski,et al. Learning from the memory of Atari 2600 , 2016, CGW@IJCAI.