Learner Behavioral Feature Refinement and Augmentation Using GANs

Learner behavioral data (e.g., clickstream activity logs) collected by online education platforms contains rich information about learners and content, but is often highly redundant. In this paper, we study the problem of learning low-dimensional, interpretable features from this type of raw, high-dimensional behavioral data. Building on generative adversarial networks (GANs), our method refines a small set of human-crafted features while also generating a set of additional, complementary features that better summarize the raw data. Through experimental validation on a real-world dataset that we collected from an online course, we demonstrate that our method yields features that are both predictive of learner quiz scores and closely related to the human-crafted features.
