Learning about Social Learning in MOOCs: From Statistical Analysis to Generative Model

We study user behavior in the courses offered by a major massive open online course (MOOC) provider during the summer of 2013. Since social learning is a key element of scalable education on MOOCs and takes place in online discussion forums, our main focus is on understanding forum activities. Two salient features of these activities drive our research: (1) high decline rate: for each course studied, the volume of discussion declined continuously throughout the duration of the course; (2) high-volume, noisy discussions: at least 30 percent of the courses produced new threads at rates too high for students or teaching staff to read through, and a substantial portion of these discussions is not directly course-related. In our analysis, we investigate factors associated with the decline of activity on MOOC forums, and we identify effective strategies for classifying threads and ranking their relevance. Specifically, we first use linear regression models to analyze the forum activity count data over time and make a number of observations; for instance, the teaching staff's active participation in the discussions is correlated with an increase in the discussion volume but does not slow down the decline rate. We then propose a unified generative model for the discussion threads, which allows us both to choose efficient thread classifiers and to design an effective algorithm for ranking thread relevance. Finally, we compare our algorithm against two baselines using human evaluation from Amazon Mechanical Turk.
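
As a rough illustration of the regression step, the sketch below fits a straight line to log-transformed daily thread counts and reads the slope as a decline rate. The log transform, the function name `fit_decline_rate`, and the synthetic counts are assumptions made for this example; the abstract does not specify the exact regression form or the covariates used.

```python
import math

def fit_decline_rate(daily_counts):
    """Ordinary least-squares fit of log(1 + count) = a + b * day.

    Returns (a, b). A negative b indicates declining activity.
    The log(1 + count) response is an assumption of this sketch.
    """
    days = list(range(len(daily_counts)))
    ys = [math.log1p(c) for c in daily_counts]
    n = len(days)
    mean_x = sum(days) / n
    mean_y = sum(ys) / n
    # Closed-form simple linear regression: slope = Sxy / Sxx.
    sxx = sum((x - mean_x) ** 2 for x in days)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(days, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Synthetic counts (made up for illustration, not data from the study):
# activity starts at 200 new threads per day and decays by about 5% daily.
counts = [round(200 * math.exp(-0.05 * d)) for d in range(40)]
intercept, slope = fit_decline_rate(counts)
print(f"estimated daily decline rate: {slope:.3f}")
```

Comparing the fitted intercept and slope across courses, for example those with and without active teaching staff, is one way to separate an effect on overall volume from an effect on the decline rate.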
