Equivariant and Scale-Free Tucker Decomposition Models

Analyses of array-valued datasets often involve reduced-rank array approximations, typically obtained via least-squares or truncations of array decompositions. However, least-squares approximations tend to be noisy in high-dimensional settings, and may not be appropriate for arrays that include discrete or ordinal measurements. This article develops methodology to obtain low-rank model-based representations of continuous, discrete and ordinal data arrays. The model is based on a parameterization of the mean array as a multilinear product of a reduced-rank core array and a set of index-specific orthogonal eigenvector matrices. It is shown how orthogonally equivariant parameter estimates can be obtained from Bayesian procedures under invariant prior distributions. Additionally, priors on the core array are developed that act as regularizers, leading to improved inference over the standard least-squares estimator, and providing robustness to misspecification of the array rank. This model-based approach is extended to accommodate discrete or ordinal data arrays using a semiparametric transformation model. The resulting low-rank representation is scale-free, in the sense that it is invariant to monotonic transformations of the data array. In an example analysis of a multivariate discrete network dataset, this scale-free approach provides a more complete description of data patterns.
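
The multilinear (Tucker) product parameterization and the scale-free property described above can be illustrated with a minimal numerical sketch. It is not the article's estimation procedure. It assumes a three-way array, and the names (`tucker_product`, `random_orthonormal`, `ranks`), the dimensions, and the noise level are all hypothetical choices made for illustration. The sketch builds a mean array from a small core and matrices with orthonormal columns. It then checks that rotating one factor while counter-rotating the core leaves the mean array unchanged, which is the kind of orthogonal indeterminacy that motivates equivariant estimation. Finally it checks that the ranks of the data are unchanged by a strictly increasing transformation.

```python
import numpy as np


def tucker_product(core, factors):
    """Multiply a 3-way core array by one factor matrix along each mode."""
    u1, u2, u3 = factors
    return np.einsum("abc,ia,jb,kc->ijk", core, u1, u2, u3)


def random_orthonormal(rng, n_rows, n_cols):
    """Return an n_rows x n_cols matrix with orthonormal columns (via QR)."""
    q, _ = np.linalg.qr(rng.standard_normal((n_rows, n_cols)))
    return q


def ranks(values):
    """Ranks 0..n-1 of a flat array (assumes no ties)."""
    return np.argsort(np.argsort(values))


rng = np.random.default_rng(0)
dims, core_dims = (6, 5, 4), (3, 2, 2)  # illustrative sizes only

# Mean array = reduced-rank core multiplied by orthonormal factor matrices.
core = rng.standard_normal(core_dims)
factors = [random_orthonormal(rng, m, r) for m, r in zip(dims, core_dims)]
mean_array = tucker_product(core, factors)

# The mode-1 unfolding of the mean array has rank at most core_dims[0].
mode1_rank = np.linalg.matrix_rank(mean_array.reshape(dims[0], -1))

# Rotate the first factor by an orthogonal matrix R and counter-rotate the
# core along mode 1; the resulting mean array is the same.
R = random_orthonormal(rng, core_dims[0], core_dims[0])
rotated_core = np.einsum("da,abc->dbc", R, core)
rotated_factors = [factors[0] @ R.T, factors[1], factors[2]]
same_mean = np.allclose(mean_array, tucker_product(rotated_core, rotated_factors))

# Scale-free idea: ranks of the data are unchanged by a strictly increasing
# transformation, so a procedure that uses only the ordering is unaffected.
data = mean_array + 0.1 * rng.standard_normal(dims)
same_ranks = np.array_equal(ranks(data.ravel()), ranks(np.exp(data).ravel()))
```

Here `same_mean` and `same_ranks` both evaluate to `True`. The first shows that the core and factor matrices are determined only up to such rotations. The second shows why an analysis that depends on the data only through their ordering is invariant to monotonic transformations.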
