The use of predicted values for item parameters in item response theory models: an application in intelligence tests

In testing, item response theory models are widely used to estimate item parameters and individual abilities. However, even unidimensional models require a considerable sample size if all parameters are to be estimated precisely. Introducing empirical prior information about candidates and items might reduce the number of candidates needed for parameter estimation. Using data from IQ measurement, this work shows how empirical information about items can be used effectively for item calibration and in adaptive testing. First, we propose multivariate regression trees to predict the item parameters from a set of covariates related to the item-solving process. We then compare item parameter estimation when the tree-fitted values are included in the estimation with estimation when they are ignored. Model estimation is fully Bayesian and is conducted via Markov chain Monte Carlo methods. The results are twofold: (a) in item calibration, the introduction of prior information is shown to be effective with short test lengths and small sample sizes, and (b) in adaptive testing, using the tree-fitted values instead of the estimated parameters leads to a moderate increase in test length but provides a considerable saving of resources.
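As a rough illustration of why a tree-fitted value can help calibration with small samples, the following sketch estimates a single item difficulty with and without an informative prior. It is not the authors' procedure: it assumes a logistic two-parameter model with the discrimination fixed at 1, treats abilities as known, and replaces the fully Bayesian MCMC estimation with a simple grid search for the posterior mode. The names `tree_fitted_b`, `prior_sd`, and `grid_estimate`, and all numeric values, are hypothetical.

```python
import math
import random


def p_correct(theta, a, b):
    """Logistic 2PL probability of a correct response (assumed model)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def log_likelihood(b, thetas, responses, a=1.0):
    """Log-likelihood of difficulty b, given known abilities and 0/1 responses."""
    total = 0.0
    for theta, y in zip(thetas, responses):
        p = p_correct(theta, a, b)
        total += math.log(p) if y else math.log(1.0 - p)
    return total


def log_normal_prior(b, mean, sd):
    """Log-density of a normal prior on b, up to an additive constant."""
    return -0.5 * ((b - mean) / sd) ** 2


def grid_estimate(thetas, responses, prior_mean=None, prior_sd=0.5):
    """Maximise the log-likelihood (plus log-prior, if given) over a grid of b."""
    grid = [i / 100.0 for i in range(-400, 401)]

    def objective(b):
        value = log_likelihood(b, thetas, responses)
        if prior_mean is not None:
            value += log_normal_prior(b, prior_mean, prior_sd)
        return value

    return max(grid, key=objective)


# Hypothetical setup: a small calibration sample of 30 candidates.
rng = random.Random(0)
true_b = 0.8          # difficulty used to simulate the data
tree_fitted_b = 0.6   # stand-in for a regression-tree prediction of b
thetas = [rng.gauss(0.0, 1.0) for _ in range(30)]
responses = [rng.random() < p_correct(t, 1.0, true_b) for t in thetas]

b_ml = grid_estimate(thetas, responses)                             # prior ignored
b_map = grid_estimate(thetas, responses, prior_mean=tree_fitted_b)  # prior included

print("estimate without prior:", b_ml)
print("estimate with tree-based prior:", b_map)
```

With the prior included, the estimate is pulled from the data-only value toward the tree-fitted value; the smaller the sample or the smaller `prior_sd`, the stronger the pull. How much this helps in practice depends on how well the tree predicts the true parameters.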
