Learning Attribute Patterns in High-Dimensional Structured Latent Attribute Models

Structured latent attribute models (SLAMs) are a special family of discrete latent variable models widely used in the social and biological sciences. This paper considers the problem of learning significant attribute patterns from a SLAM with potentially high-dimensional configurations of the latent attributes. We address the theoretical identifiability issue, propose a penalized likelihood method for selecting the attribute patterns, and establish selection consistency in such an overfitted SLAM with a diverging number of latent patterns. Simulation studies and two real datasets from educational assessment illustrate the good performance of the proposed methodology.
