Sample and Feedback Efficient Hierarchical Reinforcement Learning from Human Preferences
Jan Peters | Gerhard Neumann | Takayuki Osa | Riad Akrour | Robert Pinsler
[1] Yasemin Altun,et al. Relative Entropy Policy Search , 2010 .
[2] L. Thurstone,et al. A law of comparative judgment , 1927 .
[3] Doina Precup,et al. Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning , 1999, Artif. Intell.
[4] Stefan Schaal,et al. Learning motion primitive goals for robust manipulation , 2011, 2011 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[5] Oliver Kroemer,et al. Combining active learning and reactive control for robot grasping , 2010, Robotics Auton. Syst.
[6] Eyke Hüllermeier,et al. Preference-Based Policy Iteration: Leveraging Preference Learning for Reinforcement Learning , 2011, ECML/PKDD.
[7] Sridhar Mahadevan,et al. Recent Advances in Hierarchical Reinforcement Learning , 2003, Discret. Event Dyn. Syst.
[8] Johannes Fürnkranz,et al. A Survey of Preference-Based Reinforcement Learning Methods , 2017, J. Mach. Learn. Res.
[9] Johannes Fürnkranz,et al. Model-Free Preference-Based Reinforcement Learning , 2016, AAAI.
[10] Jan Peters,et al. A Survey on Policy Search for Robotics , 2013, Found. Trends Robotics.
[11] Alan Fern,et al. A Bayesian Approach for Policy Learning from Trajectory Preference Queries , 2012, NIPS.
[12] Carl E. Rasmussen,et al. Gaussian processes for machine learning , 2005, Adaptive computation and machine learning.
[13] Jun Nakanishi,et al. Learning Attractor Landscapes for Learning Motor Primitives , 2002, NIPS.
[14] Paul J. Besl,et al. A Method for Registration of 3-D Shapes , 1992, IEEE Trans. Pattern Anal. Mach. Intell.
[15] Jan Peters,et al. Data-Efficient Generalization of Robot Skills with Contextual Policy Search , 2013, AAAI.
[16] Danica Kragic,et al. Data-Driven Grasp Synthesis—A Survey , 2013, IEEE Transactions on Robotics.
[17] Andrew Y. Ng,et al. Policy Invariance Under Reward Transformations: Theory and Application to Reward Shaping , 1999, ICML.
[18] Máximo A. Roa,et al. Grasp quality measures: review and performance , 2014, Autonomous Robots.
[19] David Hsu,et al. Learning Dynamic Robot-to-Human Object Handover from Human Feedback , 2016, ISRR.
[20] Maya Cakmak,et al. Designing robot learners that ask good questions , 2012, 2012 7th ACM/IEEE International Conference on Human-Robot Interaction (HRI).
[21] Ling Xu,et al. Physical Human Interactive Guidance: Identifying Grasping Principles From Human-Planned Grasps , 2012, IEEE Trans. Robotics.
[22] Nitish Thatte,et al. A Sample-Efficient Black-Box Optimizer to Train Policies for Human-in-the-Loop Systems With User Preferences , 2017, IEEE Robotics and Automation Letters.
[23] Andreas Krause,et al. Information-Theoretic Regret Bounds for Gaussian Process Optimization in the Bandit Setting , 2009, IEEE Transactions on Information Theory.
[24] Oliver Kroemer,et al. Active reward learning with a novel acquisition function , 2015, Auton. Robots.
[25] Wei Chu,et al. Preference learning with Gaussian processes , 2005, ICML.
[26] Jan Peters,et al. Experiments with Hierarchical Reinforcement Learning of Multiple Grasping Policies , 2016, ISER.
[27] Andrew Y. Ng,et al. Algorithms for Inverse Reinforcement Learning , 2000, ICML.
[28] Pieter Abbeel,et al. Apprenticeship learning via inverse reinforcement learning , 2004, ICML.
[29] Michèle Sebag,et al. Programming by Feedback , 2014, ICML.
[30] Andreas Krause,et al. Contextual Gaussian Process Bandit Optimization , 2011, NIPS.
[31] Shane Legg,et al. Deep Reinforcement Learning from Human Preferences , 2017, NIPS.
[32] Christopher K. I. Williams,et al. Gaussian Processes for Machine Learning (Adaptive Computation and Machine Learning) , 2005 .
[33] Anis Sahbani,et al. An overview of 3D object grasp synthesis algorithms , 2012, Robotics Auton. Syst.
[34] Nando de Freitas,et al. Taking the Human Out of the Loop: A Review of Bayesian Optimization , 2016, Proceedings of the IEEE.
[35] Stefan Schaal,et al. Hierarchical reinforcement learning with movement primitives , 2011, 2011 11th IEEE-RAS International Conference on Humanoid Robots.