SPRITE: A Data-Driven Response Model For Multiple Choice Questions

Item response theory (IRT) models, in their most basic, dichotomous form, model a set of binary-valued (correct/incorrect) responses from individuals to items/questions. These models are ubiquitous in computer-based learning analytics and assessment applications because they enable the inference of latent student abilities/respondent traits. Since the option a student selects on a multiple-choice question (either the correct response or one of the incorrect distractor responses) contains more information about the student's ability than a simple binary-valued grade, polytomous IRT models have been developed to cover the cases of unordered (i.e., categorical) options and strictly ordered (i.e., ordinal) options. However, in many real-world educational scenarios, the distractor options in a multiple-choice question are neither categorical, since they are incorrect to varying degrees, nor ordinal, since they are not strictly ordered. Moreover, this (partial) ordering information might not be known a priori, which inhibits the application of existing polytomous IRT models in practical scenarios. In this work, we propose the SPRITE (short for stochastic polytomous response item) model, a novel IRT extension for multiple-choice questions with unknown, partially ordered options. SPRITE improves substantially over existing IRT models in that it (i) learns the (partial) ordering of the options directly from student response data, (ii) produces interpretable model parameters, and (iii) outperforms existing approaches at predicting unobserved student responses on multiple real-world educational datasets.
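
For reference, the basic dichotomous IRT model mentioned above is commonly written in its two-parameter logistic (2PL) form. The expression below is standard textbook notation included as background, not the SPRITE model's own parameterization, which may differ:

    % 2PL dichotomous IRT model (standard notation, not SPRITE itself):
    % Y_{ij} is learner i's binary-graded response to item j,
    % \theta_i the latent ability, a_j the item discrimination,
    % and b_j the item difficulty.
    \Pr(Y_{ij} = 1 \mid \theta_i) = \frac{1}{1 + \exp\!\left(-a_j(\theta_i - b_j)\right)}

Polytomous extensions such as the nominal response model (for categorical options) and the graded response model (for ordinal options) replace this Bernoulli likelihood with a categorical or ordinal likelihood over the question's answer options; SPRITE targets the intermediate, partially ordered case in which neither of those option structures can be assumed a priori.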
