Unsupervised modeling for understanding MOOC discussion forums: a learning analytics approach

Massive Open Online Courses (MOOCs) have gained attention recently because of their potential to reach large numbers of learners. Substantial empirical work has focused on student persistence and on students' interactions with course materials. However, most MOOCs also include a rich textual discussion forum, and these textual interactions remain largely unexplored. Automatically understanding the nature of discussion forum posts holds promise for providing adaptive support to individual students and to collaborative groups. This paper presents a study that applies unsupervised student understanding models, originally developed for synchronous tutorial dialogue, to MOOC forums. We use a clustering approach to group similar posts, compare the resulting clusters with manual annotations by MOOC researchers, and then examine the clusters qualitatively. This paper constitutes a step toward applying unsupervised models to asynchronous communication, which could enable massive-scale automated discourse analysis and mining to better support students' learning.
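The pipeline described above (group similar posts, then compare the groups with manual annotations) can be illustrated with a minimal sketch. This is not the authors' model: it substitutes plain k-means over normalized bag-of-words vectors for the unsupervised model, and uses cluster purity as one possible agreement measure. The function names, the toy posts, and the labels "question" and "answer" are all hypothetical and exist only for illustration.

```python
import math
import random
from collections import Counter


def vectorize(posts):
    """Turn each post into an L2-normalized bag-of-words vector (a dict)."""
    vectors = []
    for post in posts:
        counts = Counter(post.lower().split())
        # Guard against empty posts, whose norm would be zero.
        norm = math.sqrt(sum(c * c for c in counts.values())) or 1.0
        vectors.append({word: c / norm for word, c in counts.items()})
    return vectors


def cosine(a, b):
    """Cosine similarity of two already-normalized sparse vectors."""
    return sum(value * b.get(word, 0.0) for word, value in a.items())


def mean_vector(members):
    """Normalized mean of a non-empty list of sparse vectors."""
    total = Counter()
    for vec in members:
        for word, value in vec.items():
            total[word] += value
    norm = math.sqrt(sum(v * v for v in total.values())) or 1.0
    return {word: v / norm for word, v in total.items()}


def kmeans(vectors, k, iterations=20, seed=0):
    """Assign each vector to one of k clusters by cosine similarity."""
    rng = random.Random(seed)
    centroids = rng.sample(vectors, k)  # simple random seeding (assumption)
    assignment = []
    for _ in range(iterations):
        assignment = [
            max(range(k), key=lambda j: cosine(vec, centroids[j]))
            for vec in vectors
        ]
        for j in range(k):
            members = [v for v, a in zip(vectors, assignment) if a == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = mean_vector(members)
    return assignment


def purity(assignment, labels):
    """Fraction of posts whose cluster's majority manual label matches theirs."""
    matched = 0
    for cluster in set(assignment):
        members = [lab for a, lab in zip(assignment, labels) if a == cluster]
        matched += Counter(members).most_common(1)[0][1]
    return matched / len(labels)


# Hypothetical toy data: four forum posts with invented manual labels.
posts = [
    "how do i submit the assignment",
    "where do i submit the quiz",
    "you can submit it on the course page",
    "it is on the course page under week two",
]
manual_labels = ["question", "question", "answer", "answer"]

clusters = kmeans(vectorize(posts), k=2)
print(clusters, purity(clusters, manual_labels))
```

Purity is only one way to compare clusters with annotations, and it rises trivially as the number of clusters grows, so a study of this kind would normally pair it with qualitative inspection of each cluster, as the abstract describes.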
