Imputing KCs with Representations of Problem Content and Context

Cognitive task analysis is a laborious process, made more onerous in educational platforms where many problems are user-created and mostly left without identified knowledge components (KCs). Past approaches to this issue of untagged problems have centered on text mining to impute KCs. In this work, we advance KC imputation research by modeling both the content (text) of a problem and its context (the problems around it), using a novel application of skip-gram based representation learning to tens of thousands of student response sequences from the ASSISTments 2012 public dataset. We find that the contextual representation carries as much information as the content representation, and that combining the two sources of information yields 90% accuracy in predicting the missing skill from a 198-KC model. This work underscores the value of considering problems in context for the KC prediction task and has broad implications for other modeling objectives, such as KC model improvement.
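The contextual representation described above treats each student's sequence of attempted problems the way skip-gram treats a sentence of words (as in item2vec). A minimal sketch of the first step, generating (center, context) training pairs from problem-ID sequences, is below; the sequences and problem IDs are hypothetical, and a real pipeline would feed such pairs (or the raw sequences) into a skip-gram implementation such as gensim's Word2Vec.

```python
def skipgram_pairs(sequences, window=2):
    """Generate (center, context) training pairs from student problem-ID
    sequences, treating each sequence like a sentence (item2vec style)."""
    pairs = []
    for seq in sequences:
        for i, center in enumerate(seq):
            lo = max(0, i - window)
            hi = min(len(seq), i + window + 1)
            for j in range(lo, hi):
                if j != i:  # skip the center item itself
                    pairs.append((center, seq[j]))
    return pairs

# Hypothetical response sequences from two students
sequences = [["p1", "p2", "p3"], ["p2", "p4"]]
print(skipgram_pairs(sequences, window=1))
# → [('p1', 'p2'), ('p2', 'p1'), ('p2', 'p3'), ('p3', 'p2'), ('p2', 'p4'), ('p4', 'p2')]
```

Problems that co-occur in similar contexts across many students end up with similar embeddings, which is what allows a missing KC tag to be predicted from a problem's neighbors.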
