Measuring student's proficiency in MOOCs: multiple attempts extensions for the Rasch model

The popularity of online courses with open access and unlimited student participation, the so-called massive open online courses (MOOCs), has been growing rapidly. Students, professors, and universities have an interest in accurate measures of students' proficiency in MOOCs. However, these measurements face several challenges: (a) assessments are dynamic: items can be added, removed, or replaced by a course author at any time; (b) students may be allowed to make several attempts within one assessment; (c) assessments may include too few items for accurate individual-level conclusions. Therefore, the common psychometric models and techniques of Classical Test Theory (CTT) and Item Response Theory (IRT) are not well suited to measuring proficiency in this setting. In this study we address this gap and propose cross-classification multilevel logistic extensions of a common IRT model, the Rasch model, aimed at improving the assessment of students' proficiency by modeling the effect of attempts and by incorporating non-assessment data such as students' interaction with video lectures and practical tasks. We illustrate these extensions on logged data from one MOOC and evaluate their quality using a cross-validation procedure on three MOOCs. We found that (a) the change in performance over attempts depends on the student: whereas for some students performance improves, for others it may deteriorate; (b) similarly, the change over attempts varies across items; (c) students' activity with video lectures and practical tasks is a significant predictor of response correctness, in the sense that higher activity leads to a higher chance of a correct response; (d) the overall accuracy of predicting students' item responses with the extensions is 6% higher than with the traditional Rasch model. In sum, our results show that the approach improves assessment procedures in MOOCs and could serve as an additional source for accurate conclusions about students' proficiency.
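For context, the Rasch model expresses the log-odds of a correct response as the difference between a student's proficiency and an item's difficulty, and a cross-classification multilevel formulation treats both students and items as random factors. The attempt and activity effects described above can then enter as additional terms. The display below is a minimal sketch of such a specification; the attempt slopes (\delta_p, \gamma_i) and the activity covariate effect (\lambda) are illustrative assumptions, not the exact parameterization used in the study.

\mathrm{logit}\, P(Y_{pi} = 1) = \theta_p - \beta_i, \qquad \theta_p \sim N(0, \sigma_\theta^2), \quad \beta_i \sim N(0, \sigma_\beta^2)

\mathrm{logit}\, P(Y_{pia} = 1) = (\theta_p + \delta_p\, a_{pi}) - (\beta_i + \gamma_i\, a_{pi}) + \lambda\, x_p

where a_{pi} is the attempt number of student p on item i, \delta_p and \gamma_i are student- and item-specific changes in the log-odds per additional attempt, and x_p is a measure of the student's activity with video lectures and practical tasks.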
