Fast and Separable Estimation in High-Dimensional Tensor Gaussian Graphical Models

In tensor data analysis, the Kronecker covariance structure plays a vital role in unsupervised learning and regression. Under the Kronecker covariance model, the covariance of an M-way tensor is parameterized as the Kronecker product of M individual covariance matrices. With normally distributed tensors, the key to high-dimensional tensor graphical models becomes the sparse estimation of the M inverse covariance matrices. Because the tensor normal likelihood cannot be maximized analytically, existing approaches often require cyclic updates of the M sparse matrices. For high-dimensional tensor graphical models, each update step solves a regularized inverse covariance estimation problem that is computationally nontrivial. This computational challenge motivates our study of whether a non-cyclic approach can be as good as the cyclic algorithms in theory and practice. To handle potentially very high-dimensional and high-order tensors, we propose a separable and parallel estimation scheme. We show that the new estimator achieves the same minimax optimal convergence rate as the cyclic estimation approaches. Numerically, the new estimator is much faster and often more accurate than the cyclic approach. Another advantage of the separable estimation scheme is its modeling flexibility: user-specified or specially structured covariances can easily be incorporated on any mode of the tensor. We demonstrate the efficiency of the proposed method through both simulations and a neuroimaging application. Supplementary materials are available online.
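
For readers who want the model in symbols, the Kronecker covariance structure described above can be sketched as follows. The notation is an assumption made here for illustration and is not taken from the abstract: the tensor has dimensions p_1, ..., p_M with p = p_1 p_2 ... p_M, the mean is taken to be zero, \Sigma_k and \Omega_k = \Sigma_k^{-1} are the mode-k covariance and precision matrices, and X_{(k)} is the mode-k unfolding. The last identity is a standard moment property of the tensor normal model, which suggests why estimating each mode separately is plausible; the estimator actually proposed in the paper may be constructed differently.

```latex
% Tensor normal model with Kronecker (separable) covariance, zero mean assumed
\operatorname{vec}(\mathcal{X}) \sim N\!\left(0,\; \Sigma_M \otimes \cdots \otimes \Sigma_1\right),
\qquad
\Omega_M \otimes \cdots \otimes \Omega_1
  = \left(\Sigma_M \otimes \cdots \otimes \Sigma_1\right)^{-1}.

% Log-likelihood of n i.i.d. samples, up to an additive constant.
% It couples all M precision matrices, which is why cyclic updates are common.
\ell(\Omega_1,\dots,\Omega_M)
  = \frac{n}{2} \sum_{k=1}^{M} \frac{p}{p_k} \log\det \Omega_k
  - \frac{1}{2} \sum_{i=1}^{n}
      \operatorname{vec}(\mathcal{X}_i)^{\top}
      \left(\Omega_M \otimes \cdots \otimes \Omega_1\right)
      \operatorname{vec}(\mathcal{X}_i).

% Mode-k second moment: proportional to \Sigma_k alone.
\mathbb{E}\!\left[ X_{(k)} X_{(k)}^{\top} \right]
  = \Bigl( \prod_{j \neq k} \operatorname{tr}(\Sigma_j) \Bigr)\, \Sigma_k .
```

Each \Sigma_k is identified only up to a positive scale factor, since rescaling one factor by c and another by 1/c leaves the Kronecker product unchanged.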
