Cluster Analysis for Cognitive Diagnosis: Theory and Applications

Latent class models for cognitive diagnosis often begin by specifying a matrix that indicates which attributes or skills each item requires. Imposing restrictions that reflect this matrix, together with a theory of how subjects interact with items, yields parametric item response functions that are then fitted to data. Cluster analysis provides an alternative approach that does not require specifying an item response model, although it still requires the item-by-attribute matrix. After the data are summarized by a particular vector of sum-scores, K-means cluster analysis or hierarchical agglomerative cluster analysis can be applied to group subjects who possess the same skills. Asymptotic classification accuracy results are given, along with simulations comparing the effects of test length and clustering method. An application to a language examination illustrates how the methods can be implemented in practice.
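The procedure outlined above can be illustrated with a minimal sketch. It assumes that the sum-score for each attribute is the number of correct responses on the items requiring that attribute, and that the number of clusters equals the number of possible attribute patterns (four for two attributes). Both are assumptions made for illustration, since the abstract does not spell them out. The function names, the toy item-by-attribute matrix, and the toy responses are hypothetical, and the K-means routine is a plain Lloyd-style iteration rather than any particular published variant.

```python
import random


def sum_scores(responses, q_matrix):
    """Attribute sum-scores: W[i][k] = sum over items j of Y[i][j] * Q[j][k]."""
    n_attrs = len(q_matrix[0])
    return [
        [sum(y * q_row[k] for y, q_row in zip(resp, q_matrix))
         for k in range(n_attrs)]
        for resp in responses
    ]


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, n_clusters, n_iter=100, seed=0):
    """Basic Lloyd-style K-means; returns (labels, centers)."""
    rng = random.Random(seed)
    # Initialize centers at randomly chosen observations (one of many options).
    centers = [list(p) for p in rng.sample(points, n_clusters)]
    labels = None
    for _ in range(n_iter):
        # Assignment step: each subject goes to the nearest center.
        new_labels = [
            min(range(n_clusters), key=lambda c: sq_dist(p, centers[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for c in range(n_clusters):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous center
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers


# Hypothetical item-by-attribute matrix: 5 items, 2 attributes.
Q = [[1, 0], [1, 0], [0, 1], [0, 1], [1, 1]]

# Hypothetical binary responses: 6 subjects by 5 items.
Y = [
    [1, 1, 0, 0, 0],
    [1, 1, 0, 0, 0],
    [0, 0, 1, 1, 0],
    [0, 0, 1, 1, 0],
    [1, 1, 1, 1, 1],
    [0, 0, 0, 0, 0],
]

W = sum_scores(Y, Q)
labels, centers = kmeans(W, n_clusters=4)
```

Because K-means depends on its starting centers, a practical analysis would normally rerun the algorithm from several initializations and keep the solution with the smallest within-cluster sum of squares. The clusters themselves are unlabeled, so interpreting them as skill profiles is a separate step.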
