SAIL: Simulation-Informed Active In-the-Wild Learning