Teaching a Robot Tasks of Arbitrary Complexity via Human Feedback
Michael L. Littman | Mark K. Ho | Carl Trimbach | Guan Wang | Jun Ki Lee
[1] David L. Roberts,et al. A Strategy-Aware Technique for Learning Behaviors from Discrete Human Feedback , 2014, AAAI.
[2] Fiery Cushman,et al. Teaching with Rewards and Punishments: Reinforcement or Communication? , 2015, CogSci.
[3] Andrew Y. Ng,et al. Algorithms for Inverse Reinforcement Learning , 2000, ICML.
[4] Shen Li,et al. Bayesian Inference of Temporal Task Specifications from Demonstrations , 2018, NeurIPS.
[5] Ufuk Topcu,et al. Environment-Independent Task Specifications via GLTL , 2017, ArXiv.
[6] Pieter Abbeel,et al. Apprenticeship learning via inverse reinforcement learning , 2004, ICML.
[7] Nan Jiang,et al. Repeated Inverse Reinforcement Learning , 2017, NIPS.
[8] Alberto Camacho,et al. Learning Interpretable Models Expressed in Linear Temporal Logic , 2019, ICAPS.
[9] Anind K. Dey,et al. Maximum Entropy Inverse Reinforcement Learning , 2008, AAAI.
[10] Angelo Ferrando,et al. Comparing Trace Expressions and Linear Temporal Logic for Runtime Verification , 2016, Theory and Practice of Formal Methods.
[11] Michael L. Littman,et al. Reinforcement learning improves behaviour from evaluative feedback , 2015, Nature.
[12] Peter Stone,et al. Learning non-myopically from human-generated reward , 2013, IUI '13.
[13] Hadas Kress-Gazit,et al. Temporal-Logic-Based Reactive Mission and Motion Planning , 2009, IEEE Transactions on Robotics.
[14] Richard L. Lewis,et al. Where Do Rewards Come From? , 2009.
[15] Michael L. Littman,et al. Apprenticeship Learning About Multiple Intentions , 2011, ICML.
[16] Michèle Sebag,et al. Preference-Based Policy Learning , 2011, ECML/PKDD.
[17] Peter Stone,et al. Interactively shaping agents via human reinforcement: the TAMER framework , 2009, K-CAP '09.
[18] Matthias Scheutz,et al. Interpretable apprenticeship learning with temporal logic specifications , 2017, 2017 IEEE 56th Annual Conference on Decision and Control (CDC).
[19] Anca D. Dragan,et al. Cooperative Inverse Reinforcement Learning , 2016, NIPS.
[20] Mark K. Ho,et al. Social is special: A normative framework for teaching with and learning from evaluative feedback , 2017, Cognition.
[21] Shane Legg,et al. Deep Reinforcement Learning from Human Preferences , 2017, NIPS.
[22] Peter Stone,et al. Cobot in LambdaMOO: An Adaptive Social Statistics Agent , 2006, Autonomous Agents and Multi-Agent Systems.
[23] Andrea Lockerd Thomaz,et al. Teachable robots: Understanding human teaching behavior to build more effective robot learners , 2008, Artif. Intell.
[24] Matthew E. Taylor,et al. Curriculum Design for Machine Learners in Sequential Decision Tasks , 2017, IEEE Transactions on Emerging Topics in Computational Intelligence.
[25] R.L. Rivest,et al. A Formal Model of Hierarchical Concept Learning , 1994, Inf. Comput..
[26] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[27] Reid G. Simmons,et al. Complexity Analysis of Real-Time Reinforcement Learning , 1993, AAAI.
[28] Javier Ruiz-del-Solar,et al. A fast hybrid reinforcement learning framework with human corrective feedback , 2018, Auton. Robots.
[29] Johannes Fürnkranz,et al. Preference-Based Reinforcement Learning: A Preliminary Survey , 2013.
[30] Guan Wang,et al. Interactive Learning from Policy-Dependent Human Feedback , 2017, ICML.
[31] Daniel Neider,et al. Learning Linear Temporal Properties , 2018, 2018 Formal Methods in Computer Aided Design (FMCAD).
[32] John R. Koza,et al. Genetic programming - on the programming of computers by means of natural selection , 1993, Complex adaptive systems.
[33] Craig Boutilier,et al. Rewarding Behaviors , 1996, AAAI/IAAI, Vol. 2.