Existence and uniqueness of the Kronecker covariance MLE

In matrix-valued datasets the sampled matrices often exhibit correlations among both their rows and their columns. A useful and parsimonious model of such dependence is the matrix normal model, in which the covariances among the elements of a random matrix are parameterized in terms of the Kronecker product of two covariance matrices, one representing row covariances and one representing column covariances. An appealing feature of the matrix normal model is that its Kronecker covariance structure allows for standard likelihood inference even when only a very small number of data matrices is available. For instance, in some cases a likelihood ratio test of dependence may be performed with a sample size of one. In general, however, the sample size required to ensure boundedness of the matrix normal likelihood, or the existence of a unique maximizer, depends in a complicated way on the matrix dimensions. This motivates the study of how large a sample size is needed to ensure that maximum likelihood estimators exist, and exist uniquely, with probability one. Our main result gives precise sample size thresholds in the setting where the number of rows and the number of columns of the data matrices differ by at most a factor of two. Our proof uses invariance properties that allow us to consider data matrices in canonical form, as obtained from the Kronecker canonical form for matrix pencils.
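The Kronecker parameterization described above can be illustrated with a small numerical sketch. It is not taken from the paper: the dimensions, the factor matrices `A` and `B`, and the helper names are hypothetical, and the column-stacking convention for vec is an assumption. The sketch checks that a matrix X = A Z Bᵀ, with Z having i.i.d. standard normal entries, has Cov(vec X) equal to the Kronecker product of the column and row covariances, and that the two factors are determined only up to a reciprocal scaling.

```python
import numpy as np

def kronecker_covariance(row_cov, col_cov):
    """Covariance of vec(X) (column-stacking vec) under a Kronecker structure."""
    return np.kron(col_cov, row_cov)

def sample_matrix_normal(rng, row_factor, col_factor):
    """Draw X = A Z B^T, where Z has i.i.d. standard normal entries."""
    z = rng.standard_normal((row_factor.shape[1], col_factor.shape[1]))
    return row_factor @ z @ col_factor.T

rng = np.random.default_rng(0)
m, p = 3, 2                        # hypothetical numbers of rows and columns
A = rng.standard_normal((m, m))    # hypothetical row factor
B = rng.standard_normal((p, p))    # hypothetical column factor
row_cov = A @ A.T
col_cov = B @ B.T

# vec(A Z B^T) = (B kron A) vec(Z), hence Cov(vec X) = (B kron A)(B kron A)^T.
M = np.kron(B, A)
full_cov = kronecker_covariance(row_cov, col_cov)
identity_holds = np.allclose(M @ M.T, full_cov)

# The factors are identified only up to scale: (row_cov / c, c * col_cov)
# yields the same covariance for vec(X).
c = 2.5
rescaled_cov = kronecker_covariance(row_cov / c, c * col_cov)
scale_invariant = np.allclose(rescaled_cov, full_cov)

# One draw from the model, an m x p data matrix.
X = sample_matrix_normal(rng, A, B)
```

The scale ambiguity is why uniqueness of the maximum likelihood estimator is naturally understood as uniqueness of the Kronecker product rather than of the individual factors.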
