Active Reward Learning
Oliver Kroemer | Jan Peters | Christian Daniel | Malte Viering | Jan Metz
[1] W. R. Garner, et al. The effect of presenting various numbers of discrete steps on scale reading accuracy, 1951, Journal of Experimental Psychology.
[2] G. A. Miller. The magical number seven plus or minus two: some limits on our capacity for processing information, 1956, Psychological Review.
[3] Marco Colombetti, et al. Robot Shaping: Developing Autonomous Agents Through Learning, 1994, Artif. Intell.
[4] Andrew Y. Ng, et al. Policy Invariance Under Reward Transformations: Theory and Application to Reward Shaping, 1999, ICML.
[5] Jun Nakanishi, et al. Learning Attractor Landscapes for Learning Motor Primitives, 2002, NIPS.
[6] Ben Tse, et al. Autonomous Inverted Helicopter Flight via Reinforcement Learning, 2004, ISER.
[7] Wei Chu, et al. Preference learning with Gaussian processes, 2005, ICML.
[8] Shie Mannor, et al. Reinforcement learning with Gaussian processes, 2005, ICML.
[9] J. Andrew Bagnell, et al. Maximum margin planning, 2006, ICML.
[10] Andrew G. Barto, et al. Autonomous shaping: knowledge transfer in reinforcement learning, 2006, ICML.
[11] Mohammad Ghavamzadeh, et al. Bayesian Policy Gradient Algorithms, 2006, NIPS.
[12] Anind K. Dey, et al. Maximum Entropy Inverse Reinforcement Learning, 2008, AAAI.
[13] S. Ounpraseuth. Gaussian Processes for Machine Learning. Carl Edward Rasmussen and Christopher K. I. Williams, 2008.
[14] Andrea Lockerd Thomaz, et al. Teachable robots: Understanding human teaching behavior to build more effective robot learners, 2008, Artif. Intell.
[15] Betty J. Mohler, et al. Learning perceptual coupling for motor primitives, 2008, 2008 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[16] Peter Stone, et al. Interactively shaping agents via human reinforcement: the TAMER framework, 2009, K-CAP '09.
[17] David Silver, et al. Learning to search: Functional gradient techniques for imitation learning, 2009, Auton. Robots.
[18] Carl E. Rasmussen, et al. Gaussian processes for machine learning, 2005, Adaptive computation and machine learning.
[19] Manuel Lopes, et al. Active Learning for Reward Estimation in Inverse Reinforcement Learning, 2009, ECML/PKDD.
[20] Andreas Krause, et al. Information-Theoretic Regret Bounds for Gaussian Process Optimization in the Bandit Setting, 2009, IEEE Transactions on Information Theory.
[21] Darwin G. Caldwell, et al. Robot motor skill coordination with EM-based Reinforcement Learning, 2010, 2010 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[22] Richard L. Lewis, et al. Intrinsically Motivated Reinforcement Learning: An Evolutionary Perspective, 2010, IEEE Transactions on Autonomous Mental Development.
[23] Oliver Kroemer, et al. Combining active learning and reactive control for robot grasping, 2010, Robotics Auton. Syst.
[24] Yasemin Altun, et al. Relative Entropy Policy Search, 2010.
[25] Michèle Sebag, et al. Preference-Based Policy Learning, 2011, ECML/PKDD.
[26] Eyke Hüllermeier, et al. Preference-Based Policy Iteration: Leveraging Preference Learning for Reinforcement Learning, 2011, ECML/PKDD.
[27] Nando de Freitas, et al. Portfolio Allocation for Bayesian Optimization, 2010, UAI.
[28] Sam Devlin, et al. Dynamic potential-based reward shaping, 2012, AAMAS.
[29] Maya Cakmak, et al. Designing robot learners that ask good questions, 2012, 2012 7th ACM/IEEE International Conference on Human-Robot Interaction (HRI).
[30] Ling Xu, et al. Physical Human Interactive Guidance: Identifying Grasping Principles From Human-Planned Grasps, 2012, IEEE Trans. Robotics.
[31] Peter K. Allen, et al. Learning grasp stability, 2012, 2012 IEEE International Conference on Robotics and Automation.
[32] Richard L. Lewis, et al. Strong mitigation: nesting search for good policies within search for good reward, 2012, AAMAS.
[33] Michèle Sebag, et al. Interactive Robot Education, 2013.
[34] Andrea Lockerd Thomaz, et al. Policy Shaping: Integrating Human Feedback with Reinforcement Learning, 2013, NIPS.
[35] Oliver Kroemer, et al. Learning sequential motor tasks, 2013, 2013 IEEE International Conference on Robotics and Automation.
[36] Thorsten Joachims, et al. Learning Trajectory Preferences for Manipulators via Iterative Improvement, 2013, NIPS.