Calibration of Polytomous Item Families Using Bayesian Hierarchical Modeling

Complex educational assessments increasingly use item families, which are groups of related items. Calibration or scoring in an assessment involving item families requires models that account for the dependence among items belonging to the same family. This article extends earlier work in three directions: (a) it extends the model to item families with polytomous items and implements a Markov chain Monte Carlo algorithm for estimating the model parameters, (b) it generalizes family response functions to polytomous item families and defines the family score function as a way to examine each family graphically, and (c) it uses Bayes factors to choose between the more complicated model and a simpler one. All three extensions are demonstrated using two data sets: one simulated and one from the National Assessment of Educational Progress.
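To make the idea of a family score function concrete, here is a minimal sketch, not the article's implementation. It assumes each item follows a generalized partial credit model and that item parameters within a family (log-slope followed by step parameters) vary according to a multivariate normal distribution. The function names `gpcm_probs`, `expected_score`, and `family_score`, and the parameterization itself, are illustrative assumptions. The family score at a given ability is approximated as the expected item score averaged over simulated family members.

```python
import numpy as np


def gpcm_probs(theta, a, b):
    """Category probabilities for one item under a generalized partial credit model.

    theta: examinee ability (scalar); a: positive slope;
    b: step parameters, one per score point above zero.
    """
    steps = a * (theta - np.asarray(b, dtype=float))
    # Category 0 has logit 0; category k has the sum of the first k steps.
    logits = np.concatenate(([0.0], np.cumsum(steps)))
    logits -= logits.max()  # stabilize the exponentials
    p = np.exp(logits)
    return p / p.sum()


def expected_score(theta, a, b):
    """Expected score on one item at ability theta."""
    p = gpcm_probs(theta, a, b)
    return float(np.dot(np.arange(p.size), p))


def family_score(theta, mean, cov, n_draws=2000, seed=0):
    """Monte Carlo estimate of the expected score averaged over a family.

    ASSUMPTION: family members' parameters (log-slope, then steps) are
    multivariate normal with the given mean and covariance.
    """
    rng = np.random.default_rng(seed)
    draws = rng.multivariate_normal(mean, cov, size=n_draws)
    scores = [expected_score(theta, np.exp(d[0]), d[1:]) for d in draws]
    return float(np.mean(scores))


if __name__ == "__main__":
    # Hypothetical family of three-category items.
    mean = np.array([0.0, -0.5, 0.5])
    cov = 0.1 * np.eye(3)
    for theta in (-2.0, 0.0, 2.0):
        print(theta, family_score(theta, mean, cov))
```

Evaluating `family_score` over a grid of ability values and plotting the result gives one curve per family, which is the kind of graphical summary the abstract refers to. In a full Bayesian analysis the family mean and covariance would themselves be drawn from their posterior distribution rather than fixed.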
