Using Topic Segmentation Models for the Automatic Organisation of MOOCs resources

As online courses such as MOOCs become increasingly popular, demand has risen sharply for methods that organise course resources. While resources for new courses are often freely available, they are generally not organised into easily manageable units. In this paper, we investigate how state-of-the-art topic segmentation models can be used to automatically transform unstructured text into coherent sections suitable for browsing MOOC content. We confirm the suitability of this method for course organisation through experiments on a lecture corpus configured explicitly according to MOOC settings. Experimental results demonstrate the reliability and scalability of this approach across various academic disciplines. The findings also show that the topic segmentation model that uses discourse cues achieved the best results overall.
