Searching components with simple structure in simultaneous component analysis: Blockwise Simplimax rotation

Abstract

Simultaneous component analysis (SCA) is a fruitful approach to disclosing the structure underlying data that stem from multiple sources on the same objects. Such data can be organized in blocks. To identify which components relate to all sources and which to only some, the block structure of the data should be taken into account. In this paper, we propose a new rotation criterion, Blockwise Simplimax, that aims at block simplicity of the loadings, meaning that for some components all variables in a block have a zero loading. We also present an associated model selection criterion to aid in selecting the required degree of simplicity for the data at hand. An extensive simulation study is conducted to evaluate the performance of Blockwise Simplimax and the associated model selection criterion, and to compare Blockwise Simplimax with a sparse competitor, namely Sparse group SCA. In the conditions considered, Blockwise Simplimax performed reasonably well, and it either performed as well as or clearly outperformed Sparse group SCA. The model selection criterion performed well in simple conditions. The usefulness of Blockwise Simplimax and Sparse group SCA is illustrated using sensory profiling data on different cheeses.
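To make the notion of block simplicity concrete, the following is a minimal sketch and not the criterion as defined in the paper. It assumes, by analogy with ordinary Simplimax, that simplicity can be measured as the sum of the smallest block-by-component sums of squared loadings. The function names, the `n_zero_blocks` parameter, and the toy loading matrix are all hypothetical and chosen for illustration only.

```python
import numpy as np

def block_sums_of_squares(loadings, block_sizes):
    """Return an (n_blocks, n_components) array holding, for each block of
    variables and each component, the sum of squared loadings."""
    edges = np.cumsum([0] + list(block_sizes))
    return np.array([
        (loadings[start:stop] ** 2).sum(axis=0)
        for start, stop in zip(edges[:-1], edges[1:])
    ])

def block_simplicity_loss(loadings, block_sizes, n_zero_blocks):
    """Assumed Simplimax-style loss: the sum of the n_zero_blocks smallest
    block-by-component sums of squares. It is zero when that many
    block-component combinations have all-zero loadings."""
    ss = block_sums_of_squares(loadings, block_sizes)
    return np.sort(ss.ravel())[:n_zero_blocks].sum()

# Toy loading matrix: 4 variables (rows) in 2 blocks of 2, and 2 components.
# Component 1 loads on both blocks; component 2 loads only on block 2.
loadings = np.array([
    [0.8, 0.0],
    [0.7, 0.0],
    [0.6, 0.5],
    [0.9, 0.4],
])
block_sizes = [2, 2]

ss = block_sums_of_squares(loadings, block_sizes)
loss = block_simplicity_loss(loadings, block_sizes, n_zero_blocks=1)
print(ss)
print(loss)
```

In this toy matrix the first component is common to both blocks, while the second is distinctive for the second block, so requesting one zero block-component combination gives a loss of zero. A rotation method in this spirit would search for the rotation that minimizes such a loss; that search is not shown here.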
