SWIRL: A sequential windowed inverse reinforcement learning algorithm for robot tasks with delayed rewards
Brijen Thananjeyan | Sanjay Krishnan | Kenneth Y. Goldberg | Animesh Garg | Richard Liaw | Lauren Miller | Florian T. Pokorny
[1] Pieter Abbeel,et al. Third-Person Imitation Learning , 2017, ICLR.
[2] Pravesh Ranchod,et al. Nonparametric Bayesian reward segmentation for skill discovery using inverse reinforcement learning , 2015, 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).
[3] Andrew G. Barto,et al. Building Portable Options: Skill Transfer in Reinforcement Learning , 2007, IJCAI.
[4] P. Morasso. Three dimensional arm trajectories , 1983, Biological Cybernetics.
[5] Pieter Abbeel,et al. An Algorithmic Perspective on Imitation Learning , 2018, Found. Trends Robotics.
[6] Gunnar Rätsch,et al. Kernel PCA and De-Noising in Feature Spaces , 1998, NIPS.
[7] Gregory D. Hager,et al. Motion generation of robotic surgical tasks: Learning from expert demonstrations , 2010, 2010 Annual International Conference of the IEEE Engineering in Medicine and Biology.
[8] Danica Kragic,et al. Learning Actions from Observations , 2010, IEEE Robotics & Automation Magazine.
[9] Aude Billard,et al. Stochastic gesture production and recognition model for a humanoid robot , 2004, 2004 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) (IEEE Cat. No.04CH37566).
[10] Jitendra Malik,et al. Learning to Poke by Poking: Experiential Learning of Intuitive Physics , 2016, NIPS.
[11] Roderic A. Grupen,et al. A feedback control structure for on-line learning tasks , 1997, Robotics Auton. Syst..
[12] Pieter Abbeel,et al. Learning by observation for surgical subtasks: Multilateral cutting of 3D viscoelastic and 2D Orthotropic Tissue Phantoms , 2015, 2015 IEEE International Conference on Robotics and Automation (ICRA).
[13] Sridhar Mahadevan,et al. Recent Advances in Hierarchical Reinforcement Learning , 2003, Discret. Event Dyn. Syst..
[14] Michael I. Jordan,et al. Nonparametric Bayesian Learning of Switching Linear Dynamical Systems , 2008, NIPS.
[15] Jan Peters,et al. Probabilistic segmentation applied to an assembly task , 2015, 2015 IEEE-RAS 15th International Conference on Humanoid Robots (Humanoids).
[16] Sylvain Calinon,et al. Skills learning in robots by interaction with users and environment , 2014, URAI.
[17] Pieter Abbeel,et al. Learning for control from multiple demonstrations , 2008, ICML '08.
[18] Abhinav Gupta,et al. The Curious Robot: Learning Visual Representations via Physical Interactions , 2016, ECCV.
[19] Thomas G. Dietterich. Hierarchical Reinforcement Learning with the MAXQ Value Function Decomposition , 1999, J. Artif. Intell. Res..
[20] Doina Precup,et al. Learning with options: Just deliberate and relax , 2015 .
[21] Henry C. Lin,et al. JHU-ISI Gesture and Skill Assessment Working Set (JIGSAWS): A Surgical Activity Dataset for Human Motion Modeling , 2014 .
[22] Scott Niekum,et al. Learning and generalization of complex tasks from unstructured demonstrations , 2012, 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[23] Trevor Darrell,et al. TSC-DL: Unsupervised trajectory segmentation of multi-modal surgical demonstrations with Deep Learning , 2016, 2016 IEEE International Conference on Robotics and Automation (ICRA).
[24] Jan Peters,et al. Learning movement primitive attractor goals and sequential skills from kinesthetic demonstrations , 2015, Robotics Auton. Syst..
[25] Jeffrey M. Zacks,et al. Prediction Error Associated with the Perceptual Segmentation of Naturalistic Events , 2011, Journal of Cognitive Neuroscience.
[26] Jernej Barbic,et al. Segmenting Motion Capture Data into Distinct Behaviors , 2004, Graphics Interface.
[27] Scott Kuindersma,et al. Robot learning from demonstration by constructing skill trees , 2012, Int. J. Robotics Res..
[28] Doina Precup,et al. Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning , 1999, Artif. Intell..
[29] Anind K. Dey,et al. Maximum Entropy Inverse Reinforcement Learning , 2008, AAAI.
[30] Brijen Thananjeyan,et al. SWIRL: A Sequential Windowed Inverse Reinforcement Learning Algorithm for Robot Tasks With Delayed Rewards , 2016, Workshop on the Algorithmic Foundations of Robotics.
[31] Andrew T. Irish,et al. Trajectory Learning for Robot Programming by Demonstration Using Hidden Markov Model and Dynamic Time Warping , 2012, IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics).
[32] Joshua B. Tenenbaum,et al. Hierarchical Deep Reinforcement Learning: Integrating Temporal Abstraction and Intrinsic Motivation , 2016, NIPS.
[33] Sanjay Krishnan,et al. HIRL: Hierarchical Inverse Reinforcement Learning for Long-Horizon Tasks with Delayed Rewards , 2016, ArXiv.
[34] Michael I. Jordan,et al. Revisiting k-means: New Algorithms via Bayesian Nonparametrics , 2011, ICML.
[35] Ajay Kumar Tanwani,et al. Learning Robot Manipulation Tasks With Task-Parameterized Semitied Hidden Semi-Markov Model , 2016, IEEE Robotics and Automation Letters.
[36] Pieter Abbeel,et al. Hierarchical Apprenticeship Learning with Application to Quadruped Locomotion , 2007, NIPS.
[37] Alec Solway,et al. Optimal Behavioral Hierarchy , 2014, PLoS Comput. Biol..
[38] Tamim Asfour,et al. Imitation Learning of Dual-Arm Manipulation Tasks in Humanoid Robots , 2006, 2006 6th IEEE-RAS International Conference on Humanoid Robots.
[39] Jan Peters,et al. Movement extraction by detecting dynamics switches and repetitions , 2010, NIPS.
[40] Bernhard Schölkopf,et al. Switched Latent Force Models for Movement Segmentation , 2010, NIPS.
[41] Jun Nakanishi,et al. Learning Attractor Landscapes for Learning Motor Primitives , 2002, NIPS.
[42] Geoffrey E. Hinton,et al. Feudal Reinforcement Learning , 1992, NIPS.
[43] Shane Legg,et al. Human-level control through deep reinforcement learning , 2015, Nature.
[44] Andrew Y. Ng,et al. Policy Invariance Under Reward Transformations: Theory and Application to Reward Shaping , 1999, ICML.
[45] S. Schaal,et al. Segmentation of endpoint trajectories does not imply segmented control , 1999, Experimental Brain Research.
[46] Sergey Levine,et al. Optimism-driven exploration for nonlinear systems , 2015, 2015 IEEE International Conference on Robotics and Automation (ICRA).
[47] Giulio Sandini,et al. Imitation learning of non-linear point-to-point robot motions using dirichlet processes , 2012, 2012 IEEE International Conference on Robotics and Automation.
[48] Brett Browning,et al. A survey of robot learning from demonstration , 2009, Robotics Auton. Syst..
[49] Darwin G. Caldwell,et al. Learning and Reproduction of Gestures by Imitation , 2010, IEEE Robotics & Automation Magazine.
[50] Stuart J. Russell,et al. Reinforcement Learning with Hierarchies of Machines , 1997, NIPS.
[51] Sebastian Thrun,et al. Issues in Using Function Approximation for Reinforcement Learning , 1999 .
[52] Gregory D. Hager,et al. Transition state clustering: Unsupervised surgical trajectory segmentation for robot learning , 2017, ISRR.
[53] Peter Kazanzides,et al. An open-source research kit for the da Vinci® Surgical System , 2014, 2014 IEEE International Conference on Robotics and Automation (ICRA).
[54] Thomas B. Moeslund,et al. A Survey of Computer Vision-Based Human Motion Capture , 2001, Comput. Vis. Image Underst..
[55] Sergey Levine,et al. Deep visual foresight for planning robot motion , 2016, 2017 IEEE International Conference on Robotics and Automation (ICRA).
[56] Emilio Frazzoli,et al. Incremental Sampling-based Algorithms for Optimal Motion Planning , 2010, Robotics: Science and Systems.
[57] Lucas Monteiro Chaves,et al. On the Prediction Error , 2019 .
[58] Stefan Schaal,et al. Learning and generalization of motor skills by learning from demonstration , 2009, 2009 IEEE International Conference on Robotics and Automation.
[59] M. Botvinick,et al. Hierarchically organized behavior and its neural foundations: A reinforcement learning perspective , 2009, Cognition.
[60] Abhinav Gupta,et al. Supersizing self-supervision: Learning to grasp from 50K tries and 700 robot hours , 2015, 2016 IEEE International Conference on Robotics and Automation (ICRA).
[61] Mirko Wächter,et al. Hierarchical segmentation of manipulation actions based on object relations and motion characteristics , 2015, 2015 International Conference on Advanced Robotics (ICAR).
[62] Alan Fern,et al. Imitation Learning with Demonstrations and Shaping Rewards , 2014, AAAI.
[63] Anind K. Dey,et al. Probabilistic pointing target prediction via inverse optimal control , 2012, IUI '12.
[64] Jun Morimoto,et al. Task-Specific Generalization of Discrete and Periodic Dynamic Movement Primitives , 2010, IEEE Transactions on Robotics.
[65] Aude Billard,et al. Learning Stable Nonlinear Dynamical Systems With Gaussian Mixture Models , 2011, IEEE Transactions on Robotics.
[66] Leslie Pack Kaelbling,et al. Hierarchical Learning in Stochastic Domains: Preliminary Results , 1993, ICML.
[67] Sang Hyoung Lee,et al. Autonomous framework for segmenting robot trajectories of manipulation task , 2015, Auton. Robots.
[68] M. Botvinick. Hierarchical models of behavior and prefrontal function , 2008, Trends in Cognitive Sciences.
[69] P Viviani,et al. Segmentation and coupling in complex movements. , 1985, Journal of experimental psychology. Human perception and performance.
[70] Rodney A. Brooks,et al. A Robust Layered Control System for a Mobile Robot , 1986, IEEE Journal on Robotics and Automation.
[71] A. Whiten,et al. Imitation of hierarchical action structure by young children. , 2006, Developmental science.
[72] John N. Tsitsiklis,et al. Neuro-dynamic programming: an overview , 1995, Proceedings of 1995 34th IEEE Conference on Decision and Control.
[73] Bernhard Hengst,et al. Discovering Hierarchy in Reinforcement Learning with HEXQ , 2002, ICML.
[74] Jonathan Lee,et al. Iterative Noise Injection for Scalable Imitation Learning , 2017, ArXiv.
[75] Andrew Y. Ng,et al. Algorithms for Inverse Reinforcement Learning , 2000, ICML.
[76] Sergey Levine,et al. Guided Cost Learning: Deep Inverse Optimal Control via Policy Optimization , 2016, ICML.
[77] Sergey Levine,et al. Learning Hand-Eye Coordination for Robotic Grasping with Large-Scale Data Collection , 2016, ISER.
[78] Pieter Abbeel,et al. Apprenticeship learning via inverse reinforcement learning , 2004, ICML.