Assessing the Dimensionality of Constructed-Response Tests Using Hierarchical Cluster Analysis: A Monte Carlo Study.

This study explored two methods used to assess the dimensionality of item response data. The paper begins with a discussion of the assessment of dimensionality and the use of factor-analytic procedures, and considers a number of problems associated with using linear factor analysis for this purpose. A procedure is then presented that uses hierarchical cluster analysis in combination with a new proximity measure. A simulation was performed to study how well three cluster methods (group average, centroid, and Ward's) recovered unidimensional and multidimensional data and whether they over- or underestimated the number of dimensions. In the simulation, only the centroid cluster method recovered the true dimensionality of simulated unidimensional data reasonably well, and only in shorter tests. For all other conditions, the three cluster methods consistently overestimated the true dimensionality of the simulated data. For three-dimensional data, Ward's cluster method was the best performing, and only the group average and Ward's cluster methods recovered the multidimensional data well. Implications for practitioners are discussed. (Contains 6 tables and 46 references.)
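
The comparison of cluster methods described above can be illustrated with a minimal sketch. It is not the study's procedure: the abstract does not define the new proximity measure, so Euclidean distance between standardized item score vectors is used as a stand-in, and the number of clusters is fixed at two rather than estimated with a stopping rule. The function names (`simulate_items`, `cluster_items`), the four-category scoring, and all parameter values are hypothetical choices made for illustration.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster


def simulate_items(n_examinees=500, items_per_dim=5, n_dims=2, seed=0):
    """Simulate polytomous item scores with a simple multidimensional structure.

    Each item depends on exactly one latent trait (an assumption of this
    sketch, not a description of the study's generating model).
    """
    rng = np.random.default_rng(seed)
    theta = rng.standard_normal((n_examinees, n_dims))  # latent abilities
    columns = []
    for d in range(n_dims):
        for _ in range(items_per_dim):
            latent = theta[:, d] + 0.5 * rng.standard_normal(n_examinees)
            # Cut the latent response into four ordered score categories (0-3).
            columns.append(np.digitize(latent, [-1.0, 0.0, 1.0]))
    return np.column_stack(columns).astype(float)


def cluster_items(scores, n_clusters, method):
    """Cluster items (not examinees) with one hierarchical method."""
    # Standardize each item so distances reflect association, not scale.
    z = (scores - scores.mean(axis=0)) / scores.std(axis=0)
    # Rows passed to linkage are items; Euclidean distance is the default.
    tree = linkage(z.T, method=method)
    return fcluster(tree, t=n_clusters, criterion="maxclust")


scores = simulate_items()
labels = {m: cluster_items(scores, 2, m) for m in ("average", "centroid", "ward")}
for method, item_labels in labels.items():
    print(method, item_labels)
```

Raw item vectors are passed to `linkage` rather than a precomputed proximity matrix because the centroid and Ward updates in SciPy are defined for Euclidean distances. Substituting a different proximity measure would require checking that the chosen linkage method is still meaningful for it.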
