Automatic relevance determination for multi‐way models

Estimating the adequate number of components is an important yet difficult problem in multi‐way modeling. We demonstrate how a Bayesian framework for model selection based on automatic relevance determination (ARD) can be adapted to the Tucker and CANDECOMP/PARAFAC (CP) models. By assigning priors to the model parameters and learning the hyperparameters of these priors, the method is able to turn off excess components and simplify the core structure at the computational cost of fitting the conventional Tucker/CP model. To investigate the impact of the choice of priors, we based the ARD on both Laplace and Gaussian priors, corresponding to regularization by the sparsity‐promoting l1‐norm and the conventional l2‐norm, respectively. While the form of the priors had limited effect on the results obtained, the ARD approach turned out to be a useful, simple, and efficient tool for selecting the adequate number of components of data within the Tucker and CP structure. For the Tucker and CP models, the approach performs better than heuristics such as the Bayesian information criterion (BIC), Akaike's information criterion (AIC), DIFFIT, and the numerical convex hull (NumConvHull), while operating only at the cost of estimating an ordinary CP/Tucker model. For the CP model, the ARD approach performs almost as well as the core consistency diagnostic (CorConDiag). Thus, the ARD framework is a simple yet efficient tool for the estimation of the adequate number of components in multi‐way models. A Matlab implementation of the proposed algorithm is available for download at www.erpwavelab.org. Copyright © 2009 John Wiley & Sons, Ltd.
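
The link between the priors and the l2/l1 regularization mentioned above can be made concrete with a short worked derivation. This is a minimal sketch under stated assumptions, not the paper's exact formulation: it assumes a three‐way CP model with D components, i.i.d. Gaussian noise of known variance \sigma^2, a flat prior on the hyperparameters, and one hyperparameter \alpha_d per component shared across the three modes. The symbols (a_d, b_d, c_d, \alpha_d, I, J, K) are illustrative choices that do not appear in the abstract.

```latex
% Assumed CP model for a three-way array X of size I x J x K:
%   x_{ijk} = \sum_{d=1}^{D} a_{id} b_{jd} c_{kd} + e_{ijk},  e_{ijk} ~ N(0, \sigma^2)

% Gaussian prior on the d-th component (assumption: \alpha_d shared across modes)
\begin{align}
P(\mathbf{a}_d, \mathbf{b}_d, \mathbf{c}_d \mid \alpha_d)
  &= \left(\frac{\alpha_d}{2\pi}\right)^{\frac{I+J+K}{2}}
     \exp\!\left(-\frac{\alpha_d}{2}
       \left(\|\mathbf{a}_d\|_2^2 + \|\mathbf{b}_d\|_2^2 + \|\mathbf{c}_d\|_2^2\right)\right)
\end{align}

% Negative log posterior: least-squares fit plus an l2 penalty per component
\begin{align}
\mathcal{L}
  &= \frac{1}{2\sigma^2} \sum_{i,j,k}
       \Big(x_{ijk} - \sum_{d} a_{id} b_{jd} c_{kd}\Big)^2
   + \sum_{d} \left[\frac{\alpha_d}{2}
       \left(\|\mathbf{a}_d\|_2^2 + \|\mathbf{b}_d\|_2^2 + \|\mathbf{c}_d\|_2^2\right)
     - \frac{I+J+K}{2}\log\alpha_d\right] + \text{const}
\end{align}

% Setting \partial\mathcal{L} / \partial\alpha_d = 0 gives the hyperparameter update
\begin{align}
\alpha_d &= \frac{I+J+K}
  {\|\mathbf{a}_d\|_2^2 + \|\mathbf{b}_d\|_2^2 + \|\mathbf{c}_d\|_2^2}
\end{align}

% Laplace prior: the same steps with the l1-norm in place of the squared l2-norm
\begin{align}
P(\mathbf{a}_d, \mathbf{b}_d, \mathbf{c}_d \mid \alpha_d)
  &= \left(\frac{\alpha_d}{2}\right)^{I+J+K}
     \exp\!\left(-\alpha_d
       \left(\|\mathbf{a}_d\|_1 + \|\mathbf{b}_d\|_1 + \|\mathbf{c}_d\|_1\right)\right),
  &
\alpha_d &= \frac{I+J+K}
  {\|\mathbf{a}_d\|_1 + \|\mathbf{b}_d\|_1 + \|\mathbf{c}_d\|_1}
\end{align}
```

Under these assumptions, a component whose loadings shrink receives a larger \alpha_d, which in turn penalizes it more strongly in the next update of the loadings. This feedback is one way to read the statement that learning the hyperparameters turns off excess components, while each loading update remains a regularized version of the ordinary CP fit.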
