A Sequence Data Model for Analyzing Temporal Patterns of Student Data

Data models built for analyzing student data often obscure temporal relationships, either for simplicity or to aid generalization. We present a model that uses the temporal relationships in heterogeneous data as the basis for building predictive models. We show how within-semester and between-semester temporal patterns can provide insight into the student experience. For example, in a within-semester model, the final course grade can be predicted from weekly activities and submissions recorded in the learning management system (LMS). In a between-semester model, success or failure in a degree program can be predicted from sequence patterns of grades and activities across multiple semesters. The benefits of our sequence data model include temporal structure, segmentation, contextualization, and storytelling. To demonstrate these benefits, we collected and analyzed 10 years of student data from the College of Computing at UNC Charlotte in a between-semester sequence model, and used data from an introductory computer science course to build a within-semester sequence model. Our results for the two sequence models show that analytics based on the sequence data model can achieve higher predictive accuracy than non-temporal models built on the same data.
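As a rough illustration of what a within-semester sequence might look like, the sketch below groups LMS events into an ordered list of weekly count vectors per student. It is a minimal sketch under stated assumptions, not the authors' implementation: the `Event` record, the `weekly_sequences` helper, the event kinds, and the sample data are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """Hypothetical LMS event record (field names are assumptions)."""
    student_id: str
    week: int   # 1-based week of the semester
    kind: str   # e.g. "login" or "submission"


def weekly_sequences(events, n_weeks, kinds):
    """Return {student_id: [counts per kind for week 1, week 2, ...]}.

    Each student's value is a list of length n_weeks, ordered by week,
    so the temporal structure of the semester is preserved.
    """
    index = {kind: i for i, kind in enumerate(kinds)}
    seqs = defaultdict(lambda: [[0] * len(kinds) for _ in range(n_weeks)])
    for e in events:
        # Ignore events outside the semester or of an untracked kind.
        if 1 <= e.week <= n_weeks and e.kind in index:
            seqs[e.student_id][e.week - 1][index[e.kind]] += 1
    return dict(seqs)


# Made-up sample data for illustration only.
events = [
    Event("s1", 1, "login"),
    Event("s1", 1, "submission"),
    Event("s1", 3, "login"),
    Event("s2", 2, "login"),
]
sequences = weekly_sequences(events, n_weeks=3, kinds=["login", "submission"])
```

A non-temporal model would typically collapse each student's list into semester totals; keeping the week order is what would let a downstream predictor see patterns such as activity dropping off mid-semester.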
