Regularized matrix data clustering and its application to image analysis

We propose a novel regularized mixture model for clustering matrix-valued data. The proposed method assumes a separable covariance structure for each cluster and imposes a sparsity structure (e.g., low rank or spatial sparsity) on the mean signal of each cluster. We formulate the problem as a finite mixture of matrix-normal distributions with regularization terms and develop an EM-type algorithm for efficient computation. In theory, we show that the proposed estimators are strongly consistent for various choices of penalty functions. Simulations and two applications to brain signal studies confirm the strong performance of the proposed method, including better prediction accuracy than competing methods and scientific interpretability of the solution.
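As a rough illustration of the formulation described above, a penalized log-likelihood for a finite mixture of matrix-normal distributions might be written as follows. The notation is assumed for this sketch and is not taken from the article: $X_i \in \mathbb{R}^{p \times q}$ are the observations, $K$ is the number of clusters, $\pi_k$ are mixing proportions, $M_k$ is the mean of cluster $k$, $\Sigma_k$ and $\Psi_k$ are its row and column covariances (the separable structure), and $P$ is a generic penalty with tuning parameter $\lambda_k$. The exact penalty weighting used by the authors may differ.

```latex
% Matrix-normal density with separable (row/column) covariance; notation assumed
\phi(X; M, \Sigma, \Psi)
  = (2\pi)^{-pq/2}\, |\Sigma|^{-q/2}\, |\Psi|^{-p/2}
    \exp\!\Big\{ -\tfrac{1}{2}\,
      \operatorname{tr}\!\big[ \Sigma^{-1} (X - M)\, \Psi^{-1} (X - M)^{\top} \big] \Big\}

% Penalized mixture log-likelihood (a sketch; the weighting of the penalty is an assumption)
\ell_{\lambda}(\Theta)
  = \sum_{i=1}^{n} \log \Big\{ \sum_{k=1}^{K} \pi_k\,
      \phi(X_i; M_k, \Sigma_k, \Psi_k) \Big\}
    - \sum_{k=1}^{K} \lambda_k\, P(M_k)

% Example penalties matching the two sparsity structures named in the abstract
P(M) = \|M\|_{*}  \quad \text{(nuclear norm, encourages low rank)}
\qquad
P(M) = \|M\|_{1} = \sum_{j,l} |m_{jl}| \quad \text{(entrywise } \ell_1 \text{, encourages sparsity)}
```

Under this sketch, an EM-type algorithm would alternate between computing posterior cluster memberships given the current parameters and updating $\pi_k$, $M_k$, $\Sigma_k$, and $\Psi_k$, with the penalty entering only the update of $M_k$.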
