An Empirical Study of Non-Expert Curriculum Design for Machine Learners

Existing machine-learning research has shown that algorithms can benefit from curriculum learning, a strategy in which the target behavior of the learner is changed over time. Most existing work, however, focuses on developing automatic methods that iteratively select training examples of increasing difficulty tailored to the current ability of the learner, and neglects how non-expert humans might design curricula. In this work we introduce a curriculum-design problem in the context of reinforcement learning and conduct a user study to explicitly explore how non-expert humans go about assembling curricula. We present results from 80 participants on Amazon Mechanical Turk showing that 1) humans can successfully design curricula that gradually introduce more complex concepts to the agent, both within each curriculum and even across different curricula, and 2) users add task complexity in different ways and follow salient principles when selecting tasks for a curriculum. This work serves as an important first step towards better integration of non-expert humans into the reinforcement learning process and the development of new machine-learning algorithms that accommodate human teaching strategies.
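
For concreteness, below is a minimal Python sketch, not taken from the paper, of the automatic strategy the abstract contrasts with human design: training on tasks ordered by difficulty and advancing once the learner masters the current task. The task names, difficulty scores, evaluate and train_step stand-ins, and the mastery threshold are all illustrative assumptions.

    import random

    def evaluate(policy, task):
        # Stand-in for rolling out the current policy on `task`;
        # returns an estimated success rate in [0, 1].
        return min(1.0, policy.get(task, 0.0) + random.uniform(0.0, 0.2))

    def train_step(policy, task):
        # Stand-in for one batch of reinforcement learning on `task`.
        policy[task] = policy.get(task, 0.0) + 0.1

    def run_curriculum(difficulties, mastery=0.8, max_steps=1000):
        # Train on tasks from easiest to hardest, advancing once the
        # learner's estimated success rate crosses the mastery threshold.
        policy, step = {}, 0
        for task in sorted(difficulties, key=difficulties.get):
            while evaluate(policy, task) < mastery and step < max_steps:
                train_step(policy, task)
                step += 1
        return policy

    # Hypothetical tasks with hand-assigned difficulty scores; in the
    # setting studied here, a human designer would choose the ordering
    # instead of relying on such scores.
    run_curriculum({"empty-room": 1, "one-obstacle": 2, "maze": 3})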
