Modeling Rule-Based Item Generation

This paper discusses an application of a hierarchical item response theory (IRT) model to items grouped in families, where each family is generated by a different combination of design rules. Within a family, the items are assumed to differ only in surface features. The model parameters are estimated in a Bayesian framework using a data-augmented Gibbs sampler. An obvious application of the model is computerized algorithmic item generation, which has the potential to increase both the cost-effectiveness of item generation and the flexibility of item administration. The model is applied to data from a non-verbal intelligence test that was constructed using design rules. In addition, results are presented from a simulation study conducted to evaluate parameter recovery.
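The structure described above can be illustrated with a small simulation. The sketch below is not the authors' model or code: it assumes a one-parameter normal-ogive response function, three binary design rules with made-up effects on difficulty, and a small within-family spread, none of which are specified in the abstract. The function name `simulate_item_families` and all numeric values are hypothetical. It generates responses through a latent continuous variable, which is the representation that data-augmented Gibbs samplers for normal-ogive models typically work with.

```python
import numpy as np


def simulate_item_families(n_persons=500, n_families=4, items_per_family=5, seed=0):
    """Simulate binary responses to items nested in rule-generated families.

    All distributions and values here are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)

    # Assumed: each family is defined by a combination of 3 binary design rules.
    n_rules = 3
    design = rng.integers(0, 2, size=(n_families, n_rules))

    # Assumed rule effects on difficulty (hypothetical values).
    rule_effects = np.array([0.5, -0.3, 0.8])

    # Family-level mean difficulty is a linear function of the design rules.
    family_mean = design @ rule_effects

    # Items within a family differ only slightly (surface features),
    # modeled here as a small random deviation around the family mean.
    within_sd = 0.2
    difficulty = family_mean[:, None] + within_sd * rng.standard_normal(
        (n_families, items_per_family)
    )

    # Person abilities drawn from a standard normal distribution.
    theta = rng.standard_normal(n_persons)

    # Latent-variable form of the normal-ogive model:
    # z = theta - b + eps with eps ~ N(0, 1), and y = 1 exactly when z > 0.
    b = difficulty.ravel()
    z = theta[:, None] - b[None, :] + rng.standard_normal((n_persons, b.size))
    responses = (z > 0).astype(int)

    return design, family_mean, difficulty, responses
```

Calling `simulate_item_families()` returns the design matrix, the family mean difficulties, the item difficulties, and a persons-by-items response matrix. A parameter recovery study in this spirit would fit a model to `responses` and compare the estimates with `family_mean` and `difficulty`.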
