Finite mixtures of matrix normal distributions for classifying three-way data

Matrix-variate distributions offer a natural way to model random matrices. Realizations of random matrices arise when several variables are observed simultaneously in different situations or locations, and are commonly arranged in three-way data structures. Among matrix-variate distributions, the matrix normal density plays the same pivotal role that the multivariate normal distribution plays in the family of multivariate distributions. In this work we define and explore finite mixtures of matrix normals. We develop an EM algorithm for estimating the model and demonstrate some useful properties. Finally, we show that the proposed mixture model can be a powerful tool for classifying three-way data in both supervised and unsupervised problems. A simulation study and some real examples are presented.
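As a rough illustration of the objects named above, the following sketch evaluates a matrix normal log-density and the posterior component probabilities of a finite mixture of matrix normals, which is the quantity an E-step or a classification rule would use. It assumes the standard parameterization with a p x r mean matrix M, a p x p among-row covariance U, and an r x r among-column covariance V. The function names and argument layout are hypothetical and are not taken from the paper.

```python
import numpy as np


def matrix_normal_logpdf(X, M, U, V):
    """Log-density of a p x r matrix X under a matrix normal with mean M,
    row covariance U (p x p) and column covariance V (r x r)."""
    p, r = X.shape
    R = X - M
    # Quadratic form tr(V^{-1} R^T U^{-1} R), computed with linear solves
    # rather than explicit inverses.
    Uinv_R = np.linalg.solve(U, R)
    quad = np.trace(np.linalg.solve(V, R.T @ Uinv_R))
    _, logdet_U = np.linalg.slogdet(U)
    _, logdet_V = np.linalg.slogdet(V)
    return -0.5 * (p * r * np.log(2.0 * np.pi)
                   + r * logdet_U + p * logdet_V + quad)


def mixture_log_components(X, weights, means, Us, Vs):
    """Vector of log(pi_k) + log f_k(X), one entry per mixture component."""
    return np.array([
        np.log(w) + matrix_normal_logpdf(X, M, U, V)
        for w, M, U, V in zip(weights, means, Us, Vs)
    ])


def mixture_logpdf(X, weights, means, Us, Vs):
    """Log-density of the finite mixture at X (log-sum-exp over components)."""
    return np.logaddexp.reduce(mixture_log_components(X, weights, means, Us, Vs))


def posterior_probs(X, weights, means, Us, Vs):
    """Posterior probability that X belongs to each component."""
    log_comps = mixture_log_components(X, weights, means, Us, Vs)
    return np.exp(log_comps - np.logaddexp.reduce(log_comps))
```

Under this parameterization, the column-stacked vector vec(X) is multivariate normal with covariance given by the Kronecker product of V and U, which provides a simple check on the density. Assigning an observed matrix to the component with the largest posterior probability gives one possible classification rule.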
