Guided teaching interactions with robots: embodied queries and teaching heuristics

We propose to use concepts from algorithmic teaching to measure and improve human teaching for machine learners. We first investigate input examples produced by human teachers in comparison to optimal or useful teaching sequences, and find that human teachers do not naturally generate the best learning examples. Then we provide humans with teaching guidance in the form of a step-by-step teaching strategy or a generic teaching heuristic, to elicit better teaching. We present results for both experiments on three different problems, showing that everyday human teachers are not naturally optimal from a machine learning perspective, but that teaching guidance significantly improves their input. This provides promising evidence that human intelligence and flexibility can be leveraged to achieve better sample efficiency when input data to a learning algorithm comes from a human teacher.
