Text stream mining for Massive Open Online Courses: review and perspectives

Massive Open Online Course (MOOC) systems have recently received significant recognition and are increasingly attracting the attention of education providers and educational researchers. MOOCs are neither precisely defined nor sufficiently researched in terms of their properties and usage. The large number of students enrolled in these courses can lead to students receiving insufficient feedback, and the continuous stream of student posts to course forums makes the problem even more difficult. Interactions between students and MOOCs can be exploited using text mining techniques to enhance learning and personalise the learners’ experience. In this paper, the open issues in MOOCs are outlined. Text mining and streaming text mining techniques that can contribute to the success of these systems are reviewed, and some of these open issues are addressed. Finally, our vision of an intelligent personalised MOOC feedback management system, which we term iMOOC, is outlined.
