Robust estimator of the correlation matrix with sparse Kronecker structure for a high-dimensional matrix-variate

Abstract In many applications it is of interest to estimate the correlation matrix of a high-dimensional matrix-variate $X \in \mathbb{R}^{p \times q}$. Existing works usually impose strong assumptions on the distribution of X, such as sub-Gaussianity or strong moment conditions. These assumptions are easily violated in practice, so a robust estimator is desirable. In this paper, we consider the case where the correlation matrix has a sparse Kronecker structure and propose a robust estimator based on Kendall's τ correlation. The proposed estimator is further extended to tensor data. We establish the theoretical properties of the estimator, showing that the Kronecker structure increases the effective sample size and leads to a fast convergence rate. Finally, we apply the proposed estimator to the bigraphical model, obtaining an estimator with a faster convergence rate than existing results. Simulations and real data analysis confirm the competitiveness of the proposed method.
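
As a rough illustration of what a Kendall's τ based correlation estimate looks like, the sketch below computes pairwise Kendall's τ on the vectorized entries of X and applies the transform sin(πτ/2), which is the usual rank-based correlation estimate for elliptical-type distributions. This is an assumption for illustration only: the abstract does not spell out the estimator, and the sketch neither exploits the Kronecker structure nor imposes sparsity. The function names `kendall_tau`, `kendall_correlation`, and `vectorized_correlation` are hypothetical.

```python
import numpy as np


def kendall_tau(x, y):
    """Kendall's tau (tau-a, no tie correction) for two 1-D samples."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.shape[0]
    # Sign of every pairwise difference: a concordant pair contributes +1,
    # a discordant pair contributes -1, and the diagonal contributes 0.
    sx = np.sign(x[:, None] - x[None, :])
    sy = np.sign(y[:, None] - y[None, :])
    # The full n x n sum counts each unordered pair twice, so dividing by
    # n * (n - 1) is the same as averaging over the n * (n - 1) / 2 pairs.
    return float((sx * sy).sum() / (n * (n - 1)))


def kendall_correlation(samples):
    """Rank-based correlation estimate sin(pi/2 * tau) for (n, d) samples."""
    samples = np.asarray(samples, dtype=float)
    d = samples.shape[1]
    R = np.eye(d)
    for j in range(d):
        for k in range(j + 1, d):
            tau = kendall_tau(samples[:, j], samples[:, k])
            R[j, k] = R[k, j] = np.sin(0.5 * np.pi * tau)
    return R


def vectorized_correlation(X):
    """Estimate the (pq x pq) correlation matrix from X of shape (n, p, q).

    Illustrative only: each matrix observation is flattened, so any
    Kronecker structure of the correlation matrix is ignored here.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    return kendall_correlation(X.reshape(n, -1))
```

Because Kendall's τ depends only on the ranks of the observations, the estimate is unchanged under strictly increasing transformations of each entry, which is the source of its robustness to heavy tails.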
