PRACTICAL APPLICATIONS OF POSTERIOR PREDICTIVE MODEL CHECKING FOR ASSESSING FIT OF COMMON ITEM RESPONSE THEORY MODELS

Model checking in item response theory (IRT) is an underdeveloped area. No universally accepted model checking tool exists in this field, even for the simple models, let alone the more complicated models suggested recently. The posterior predictive model checking method (Guttman, 1967; Rubin, 1981, 1984) is a popular Bayesian model checking tool because of its simplicity, strong theoretical basis, intuitive appeal, and ability to provide graphical evidence. This paper applies the method to a number of real data examples. An important issue in applying the posterior predictive model checking method is the choice of discrepancy measures, which play the role that test statistics play in classical hypothesis testing. This paper uses the discrepancy measures found useful in Sinharay and Johnson (2003). The method appears promising for detecting different types of misfit of the common IRT models in real applications.
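The general recipe behind a posterior predictive check can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure used in the paper: the model is a deliberately oversimplified one (every item shares a single success probability with a uniform Beta(1, 1) prior) rather than an IRT model, and the data, the function names, and the discrepancy measure (the spread between the easiest and hardest item) are all hypothetical.

```python
import random


def item_proportion_range(responses):
    """Hypothetical discrepancy measure: gap between the highest and
    lowest item proportion-correct in a 0/1 response matrix."""
    n = len(responses)
    n_items = len(responses[0])
    props = [sum(row[j] for row in responses) / n for j in range(n_items)]
    return max(props) - min(props)


def posterior_predictive_p_value(responses, discrepancy, n_draws=500, seed=0):
    """Estimate P(D(y_rep) >= D(y) | y) under a toy model in which all
    responses are i.i.d. Bernoulli(p) with a Beta(1, 1) prior on p."""
    rng = random.Random(seed)
    n = len(responses)
    n_items = len(responses[0])
    total = n * n_items
    total_correct = sum(sum(row) for row in responses)
    observed = discrepancy(responses)

    exceed = 0
    for _ in range(n_draws):
        # Step 1: draw the parameter from its (conjugate) posterior.
        p = rng.betavariate(1 + total_correct, 1 + total - total_correct)
        # Step 2: simulate a replicated data set of the same shape.
        replicated = [
            [1 if rng.random() < p else 0 for _ in range(n_items)]
            for _ in range(n)
        ]
        # Step 3: compare the replicated discrepancy with the observed one.
        if discrepancy(replicated) >= observed:
            exceed += 1
    return exceed / n_draws


# Hypothetical data: four items of clearly different difficulty,
# which the equal-difficulty toy model cannot reproduce.
data_rng = random.Random(1)
true_item_probs = [0.9, 0.7, 0.5, 0.3]
observed_data = [
    [1 if data_rng.random() < q else 0 for q in true_item_probs]
    for _ in range(200)
]

p_value = posterior_predictive_p_value(observed_data, item_proportion_range)
print(p_value)
```

A p-value close to 0 or 1 indicates that the replicated data sets rarely reproduce the observed value of the discrepancy measure, which is evidence of misfit. With an IRT model, the posterior draws in step 1 would typically come from an MCMC sampler instead of a closed-form posterior.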

[1]  I. W. Molenaar,et al.  A multidimensional item response model: Constrained latent class analysis using the Gibbs sampler and posterior predictive checks , 1997 .

[2]  I. Guttman The Use of the Concept of a Future Observation in Goodness‐Of‐Fit Problems , 1967 .

[3]  Karl Pearson,et al.  ON A NEW METHOD OF DETERMINING CORRELATION BETWEEN A MEASURED CHARACTER A, AND A CHARACTER B, OF WHICH ONLY THE PERCENTAGE OF CASES WHEREIN B EXCEEDS (OR FALLS SHORT OF) A GIVEN INTENSITY IS RECORDED FOR EACH GRADE OF A , 1909 .

[4]  Sandip Sinharay,et al.  SIMULATION STUDIES APPLYING POSTERIOR PREDICTIVE MODEL CHECKING FOR ASSESSING FIT OF THE COMMON ITEM RESPONSE THEORY MODELS , 2003 .

[5]  William Stout,et al.  A nonparametric approach for assessing latent trait unidimensionality , 1987 .

[6]  Hal S. Stern,et al.  Asymptotic Distribution of P Values in Composite Null Models: Comment , 2000 .

[7]  Herbert Hoijtink,et al.  Conditional Independence and Differential Item Functioning in the Two-Parameter Logistic Model , 2001 .

[8]  Sandip Sinharay,et al.  ASSESSING CONVERGENCE OF THE MARKOV CHAIN MONTE CARLO ALGORITHMS: A REVIEW , 2003 .

[9]  K. Tatsuoka Toward an Integration of Item-Response Theory and Cognitive Error Diagnosis. , 1987 .

[10]  R. Hambleton Principles and selected applications of item response theory. , 1989 .

[11]  Eric T. Bradlow,et al.  A Bayesian random effects model for testlets , 1999 .

[12]  David B. Dunson,et al.  Bayesian Data Analysis , 2010 .

[13]  Mark D. Reckase,et al.  A Linear Logistic Multidimensional Model for Dichotomous Item Response Data , 1997 .

[14]  Robert J. Mislevy,et al.  PROBABILITY‐BASED INFERENCE IN COGNITIVE DIAGNOSIS , 1994 .

[15]  Georg Rasch,et al.  Probabilistic Models for Some Intelligence and Attainment Tests , 1981, The SAGE Encyclopedia of Research Design.

[16]  Xiao-Li Meng,et al.  POSTERIOR PREDICTIVE ASSESSMENT OF MODEL FITNESS VIA REALIZED DISCREPANCIES , 1996 .

[17]  M.J.H. van Onna,et al.  Bayesian estimation and model selection in ordered latent class models for polytomous items , 2002 .

[18]  R. Scheines,et al.  Bayesian estimation and testing of structural equation models , 1999 .

[19]  Jean-Paul Fox,et al.  Bayesian modeling of measurement error in predictor variables using item response theory , 2003 .

[20]  Paul W. Holland The Dutch Identity: A new tool for the study of item response models , 1990 .

[21]  P. Holland On the sampling theory foundations of item response theory models , 1990 .

[22]  Francis Tuerlinckx,et al.  A Hierarchical IRT Model for Criterion-Referenced Measurement , 2000 .

[23]  R. Hambleton,et al.  Handbook of Modern Item Response Theory , 1997 .

[24]  D. Rubin Estimation in Parallel Randomized Experiments , 1981 .

[25]  Arnold L. van den Wollenberg,et al.  Two new test statistics for the Rasch model , 1982 .

[26]  A. Agresti Categorical data analysis , 1993 .

[27]  Melvin R. Novick,et al.  Some latent trait models and their use in inferring an examinee's ability , 1966 .

[28]  William Stout,et al.  The theoretical detect index of dimensionality and its application to approximate simple structure , 1999 .

[29]  Deborah L. Schnipke,et al.  Modeling Item Response Times With a Two-State Mixture Model: A New Method of Measuring , 1997 .

[30]  Mark Reiser,et al.  Analysis of residuals for the multinomial item response model , 1996 .

[31]  D. Rubin Bayesianly Justifiable and Relevant Frequency Calculations for the Applied Statistician , 1984 .

[32]  Ivo W. Molenaar,et al.  Some improved diagnostics for failure of the Rasch model , 1983 .