On the added value of multiset methods for three-way data analysis

Abstract: Three-way three-mode data are collected regularly in scientific research and yield information on the relations among three sets of entities. To summarize the information in such data, three-way component methods like CANDECOMP/PARAFAC (CP) and Tucker3 are often used. When applying CP and Tucker3 in empirical practice, one should be cautious, however, because they rely on very strict assumptions. We argue that imposing these assumptions may obscure interesting structural information in the data and may lead to substantive conclusions that hold for only part of the data. As a way out, this paper demonstrates that this structural information may be elegantly captured by component methods for multiset data, namely simultaneous component analysis (SCA) and its clusterwise extension (clusterwise SCA).
