A Comparison of Selected Empirical Methods for Assessing the Structure of Responses to Test Items

Selected methods for empirically assessing the structure of tests composed of dichotomous items were compared. The methods included both exploratory and confirmatory procedures from two families: methods based on parametric models and nonparametric methods based on conditional item covariances. The analysis conditions were typical of large-scale assessments; for example, the tests were composed of a relatively large number of items, and a relatively large sample was assumed to be available for analysis. The methods were compared on real data from a 62-item test of reading ability and on computer-generated data for multiple unidimensional and multidimensional cases. For the most part, all methods performed reasonably well over a relatively wide range of conditions. The exceptions occurred when the test data departed appreciably from the assumptions or inherent limitations of a method, for example, when guessing was present but not allowed for in the analysis, or when the multidimensional test structure was nonsimple but the goal of the method was to estimate the amount of multidimensional simple structure.

Index terms: test structure, test dimensionality, local item dependencies, test factors.
