Learning Data Teaching Strategies Via Knowledge Tracing

Teaching plays a fundamental role in human learning. Typically, a human teaching strategy involves assessing a student's knowledge progress and tailoring the teaching materials accordingly to enhance learning. A human teacher achieves this by tracing a student's knowledge of the important learning concepts in a task. However, such a teaching strategy has not been well exploited in machine learning, as current machine teaching methods tend to assess progress directly on individual training samples without attending to the underlying learning concepts of the task. In this paper, we propose a novel method, called Knowledge Augmented Data Teaching (KADT), which optimizes a data teaching strategy for a student model by tracing its knowledge progress over multiple learning concepts in a learning task. Specifically, KADT incorporates a knowledge tracing model to dynamically capture the knowledge progress of a student model in terms of latent learning concepts. We further develop an attention pooling mechanism that distills the knowledge representations of a student model with respect to class labels, which enables a data teaching strategy focused on critical training samples. We have evaluated KADT on four machine learning tasks: knowledge tracing, sentiment analysis, movie recommendation, and image classification. Comparisons with state-of-the-art methods empirically validate that KADT consistently outperforms them on all tasks.
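To make the attention pooling step concrete, the following is a minimal sketch, assuming a PyTorch implementation in which the student's per-concept knowledge states (for instance, the memory slots of a key-value knowledge tracing model) are pooled using one learnable query per class label. The class name AttentionPooling, the shapes, and the einsum formulation are illustrative assumptions, not details taken from the paper.

```python
# Hypothetical sketch of label-conditioned attention pooling over latent
# concept states, in the spirit of the mechanism the abstract describes.
import torch
import torch.nn as nn


class AttentionPooling(nn.Module):
    """Pools per-concept knowledge states into one vector per class label."""

    def __init__(self, concept_dim: int, num_classes: int):
        super().__init__()
        # One learnable query per class label; its dot product with each
        # latent concept state scores that concept's relevance to the label.
        self.label_queries = nn.Parameter(torch.randn(num_classes, concept_dim))

    def forward(self, concept_states: torch.Tensor) -> torch.Tensor:
        # concept_states: (batch, num_concepts, concept_dim), e.g. the
        # memory slots of a key-value knowledge tracing model.
        scores = torch.einsum("bkd,cd->bck", concept_states, self.label_queries)
        weights = torch.softmax(scores, dim=-1)  # attend over latent concepts
        # Distilled knowledge representation per class label:
        # (batch, num_classes, concept_dim).
        return torch.einsum("bck,bkd->bcd", weights, concept_states)


pool = AttentionPooling(concept_dim=64, num_classes=10)
states = torch.randn(32, 20, 64)  # 32 students, 20 latent concepts
label_reps = pool(states)         # -> torch.Size([32, 10, 64])
```

Given such pooled, label-specific representations, a teaching policy could then prioritize training samples associated with the weakest concept-label combinations, which is one plausible reading of how the distilled representations support selecting critical samples.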
