The Rasch Testlet Model

The Rasch testlet model for both dichotomous and polytomous items in testlet-based tests is proposed. It can be viewed as a special case of the multidimensional random coefficients multinomial logit model (MRCMLM). Therefore, the estimation procedures for the MRCMLM can be directly applied. Simulations were conducted to examine parameter recovery under the dichotomous Rasch testlet model and the partial-credit testlet model. Results indicated that the item and person parameters as well as the random testlet effects could be recovered very accurately under all the simulated conditions. As sample sizes were increased, the root mean square errors of the estimates decreased to an acceptable level. An empirical example of an English test with 11 testlets was given.

Index terms: multidimensional item response model, item bundle, marginal maximum likelihood estimation, parameter recovery.
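
To make the dichotomous case concrete, the following minimal sketch simulates responses of the kind used in a parameter-recovery study. The abstract does not state the model equation, so the sketch assumes the commonly cited form in which the logit of a correct response is the person ability minus the item difficulty plus a person-specific random effect for the testlet that contains the item. All names (`rasch_testlet_prob`, `simulate_responses`) and all numeric values are hypothetical and chosen only for illustration; estimation itself is not shown.

```python
import math
import random


def rasch_testlet_prob(theta, b, gamma):
    """Probability of a correct response, assuming logit = theta - b + gamma."""
    return 1.0 / (1.0 + math.exp(-(theta - b + gamma)))


def simulate_responses(n_persons, difficulties, testlet_of_item, testlet_sds, seed=0):
    """Draw 0/1 responses; each person gets one ability and one effect per testlet."""
    rng = random.Random(seed)
    data = []
    for _ in range(n_persons):
        theta = rng.gauss(0.0, 1.0)  # ability, assumed standard normal
        # Random testlet effects, assumed independent with mean zero.
        gammas = [rng.gauss(0.0, sd) for sd in testlet_sds]
        row = [
            int(rng.random() < rasch_testlet_prob(theta, b, gammas[d]))
            for b, d in zip(difficulties, testlet_of_item)
        ]
        data.append(row)
    return data


# Illustrative design: six items in two testlets of three items each.
difficulties = [-1.0, -0.5, 0.0, 0.5, 1.0, 0.0]
testlet_of_item = [0, 0, 0, 1, 1, 1]
responses = simulate_responses(500, difficulties, testlet_of_item, testlet_sds=[0.5, 1.0])
```

A larger testlet standard deviation makes responses within that testlet more strongly dependent for a given person; setting every standard deviation to zero reduces the sketch to the ordinary Rasch model.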
