ASSESSING FIT OF MODELS WITH DISCRETE PROFICIENCY VARIABLES IN EDUCATIONAL ASSESSMENT

Model checking is a crucial part of any statistical analysis. As educators tie models for testing to cognitive theory of the domains, there is a natural tendency to represent participant proficiencies with latent variables that indicate the presence or absence of the knowledge, skills, and proficiencies to be tested (Mislevy, Almond, Yan, & Steinberg, 2001). Model checking for these models is not straightforward, mainly because traditional χ²-type tests do not apply except for assessments with a small number of items. Williamson, Mislevy, and Almond (2000) note a lack of published diagnostic tools for these models. This paper suggests a number of graphics and statistics for diagnosing problems with models with discrete proficiency variables. A small diagnostic assessment first analyzed by Tatsuoka (1990) serves as a test bed for these tools. This work continues the recent work by Yan, Mislevy, and Almond (2003) on this data set. Two diagnostic tools that prove useful are Bayesian residual plots and an analog of the item characteristic curve (ICC) plots. A χ²-type statistic based on the latter plot shows some promise, but more work is required to establish the null distribution of the statistic. The suggested diagnostics identify problems with the model used by Mislevy (1995) and help in hypothesizing an improved model that seems to fit better.

[1] I. W. Molenaar et al. A Multidimensional Item Response Model: Constrained Latent Class Analysis Using the Gibbs Sampler and Posterior Predictive Checks, 1997.

[2] Mary F. Klein. Logical Error Analysis and Construction of Tests to Diagnose Student "Bugs" in Addition and Subtraction of Fractions, 1981.

[3] B. Junker et al. Cognitive Assessment Models with Few Assumptions, and Connections with Nonparametric Item Response Theory, 2001.

[4] Russell G. Almond et al. Bayes Nets in Educational Assessment: Where the Numbers Come From, 1999, UAI.

[5] D. Rubin. Bayesianly Justifiable and Relevant Frequency Calculations for the Applied Statistician, 1984.

[6] Russell G. Almond et al. Model Criticism of Bayesian Networks with Latent Variables, 2000, UAI.

[7] Russell G. Almond et al. Modeling Conditional Probabilities in Complex Educational Assessments. CSE Technical Report, 2002.

[8] Judea Pearl et al. Probabilistic Reasoning in Intelligent Systems - Networks of Plausible Inference, 1991, Morgan Kaufmann Series in Representation and Reasoning.

[9] K. Tatsuoka. Toward an Integration of Item-Response Theory and Cognitive Error Diagnosis, 1987.

[10] R. J. Mokken et al. Handbook of Modern Item Response Theory, 1997.

[11] R. Hambleton et al. Item Response Theory, 1984, The History of Educational Measurement.

[12] R. Hambleton. Principles and Selected Applications of Item Response Theory, 1989.

[13] I. Guttman. The Use of the Concept of a Future Observation in Goodness-of-Fit Problems, 1967.

[14] Matthew S. Johnson et al. Measuring Appropriability in Research and Development with Item Response Models, 1999.

[15] Xiao-Li Meng et al. Posterior Predictive Assessment of Model Fitness via Realized Discrepancies, 1996.

[16] R. Hambleton et al. Analysis of Empirical Data Using Two Logistic Latent Trait Models, 1973.

[17] Robert J. Mislevy et al. Probability-Based Inference in Cognitive Diagnosis, 1994.

[18] K. Chaloner et al. A Bayesian Approach to Outlier Detection and Residual Analysis, 1988.

[19] David J. Spiegelhalter et al. Local Computations with Probabilities on Graphical Structures and Their Application to Expert Systems, 1990.

[20] H. Stern et al. Variance Component Testing in Generalized Linear Mixed Models, 2003.

[21] K. Tatsuoka. Rule Space: An Approach for Dealing with Misconceptions Based on Item Response Theory, 1983.

[22] M. J. Bayarri et al. P Values for Composite Null Models, 2000.

[23] Sandip Sinharay et al. Simulation Studies Applying Posterior Predictive Model Checking for Assessing Fit of the Common Item Response Theory Models, 2003.

[24] B. Junker et al. Cognitive Assessment Models with Few Assumptions, and Connections with Nonparametric IRT, 2001.

[25] W. M. Yen. Using Simulation Results to Choose a Latent Trait Model, 1981.

[26] Kikumi K. Tatsuoka et al. Spotting Erroneous Rules of Operation by the Individual Consistency Index, 1983.

[27] R. Almond et al. Focus Article: On the Structure of Educational Assessments, 2003.

[28] Russell G. Almond et al. Graphical Models and Computerized Adaptive Testing, 1998.

[29] S. Chib et al. Bayesian Residual Analysis for Binary Response Regression Models, 1995.

[30] Russell G. Almond et al. Design and Analysis in a Cognitive Assessment, 2003.

[31] D. Thissen et al. Likelihood-Based Item-Fit Indices for Dichotomous Item Response Theory Models, 2000.

[32] Robert J. Mislevy et al. The Role of Probability-Based Inference in an Intelligent Tutoring System, 2005, User Modeling and User-Adapted Interaction.

[33] Walter R. Gilks et al. BUGS - Bayesian Inference Using Gibbs Sampling Version 0.50, 1995.