Weakly Supervised Learning of Dialogue Structure in MOOC Forum Threads

In this paper we present a new method for understanding discussions between students in MOOC forums. Specifically, we introduce a machine learning method for detecting when a response relation exists between a pair of posts in a forum thread, for example when one student answers a question or comments on something another student said earlier. Research has shown that understanding the conversational structure between students is essential for evaluating the productivity of their collaboration and for estimating outcomes. However, previous methods often rely on human-supplied dialogue act labels or on discourse parsing algorithms that require large labeled datasets. Our method uses a fast, exact optimization process known as spectral optimization; it does not require manually annotated training data and is highly scalable and generalizable. We report empirical results on real-world datasets of conversations between students participating in Coursera courses, where predictive accuracy exceeds 90%, nearing the human inter-annotator agreement rate for these datasets.
