Modeling Differences in the Dimensionality of Multiblock Data by Means of Clusterwise Simultaneous Component Analysis

Given multivariate multiblock data (e.g., subjects nested in groups who are measured on multiple variables), one may be interested in the nature and number of dimensions that underlie the variables, and in differences in dimensional structure across data blocks. To this end, clusterwise simultaneous component analysis (SCA) was proposed, which simultaneously clusters blocks with a similar structure and performs an SCA per cluster. However, the number of components was restricted to be the same across clusters, which is often unrealistic. In this paper, this restriction is removed, and the resulting challenges with respect to model estimation and selection are resolved.
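
As an illustration of the idea summarized above (not the paper's actual algorithm), the following Python sketch alternates between fitting one component model per cluster and reassigning each block to the cluster whose loadings reconstruct it best. It assumes a plain PCA on the vertically stacked, block-centered data as a stand-in for the SCA step. All names (`fit_loadings`, `block_loss`, `clusterwise_sca`, `make_block`) and the toy data are hypothetical. The number of components may differ per cluster.

```python
import numpy as np


def fit_loadings(member_blocks, n_components):
    """PCA on the vertically stacked blocks; returns variables x components."""
    stacked = np.vstack(member_blocks)
    _, _, vt = np.linalg.svd(stacked, full_matrices=False)
    return vt[:n_components].T  # orthonormal columns


def block_loss(block, loadings):
    """Residual sum of squares after projecting the block onto the loadings."""
    residual = block - block @ loadings @ loadings.T
    return float(np.sum(residual ** 2))


def clusterwise_sca(blocks, n_components, init_labels, max_iter=50):
    """Alternate between per-cluster fits and block reassignment.

    n_components[k] is the number of components of cluster k (may differ
    across clusters). Returns labels, loadings, and the loss per iteration.
    """
    blocks = [b - b.mean(axis=0) for b in blocks]  # center each block
    labels = np.asarray(init_labels).copy()
    if set(labels.tolist()) != set(range(len(n_components))):
        raise ValueError("init_labels must use every cluster at least once")
    loadings = [None] * len(n_components)
    history = []
    for _ in range(max_iter):
        # Step 1: refit the loadings of every non-empty cluster.
        for k, q in enumerate(n_components):
            members = [b for b, lab in zip(blocks, labels) if lab == k]
            if members:  # an emptied cluster keeps its previous loadings
                loadings[k] = fit_loadings(members, q)
        # Step 2: reassign each block to its best-fitting cluster.
        losses = np.array([[block_loss(b, ld) for ld in loadings]
                           for b in blocks])
        new_labels = losses.argmin(axis=1)
        history.append(float(losses.min(axis=1).sum()))
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return labels, loadings, history


def make_block(rng, active_vars, n_rows=30, n_vars=6):
    """Toy block: strong signal on the active variables plus small noise."""
    block = 0.1 * rng.normal(size=(n_rows, n_vars))
    block[:, active_vars] += 3.0 * rng.normal(size=(n_rows, len(active_vars)))
    return block


# Toy data: three one-dimensional blocks and three two-dimensional blocks.
rng = np.random.default_rng(0)
blocks = ([make_block(rng, [0]) for _ in range(3)]
          + [make_block(rng, [1, 2]) for _ in range(3)])

# Start from the generating partition to check that it is a fixed point;
# in practice one would typically try many random starting partitions.
labels, loadings, history = clusterwise_sca(
    blocks, n_components=[1, 2], init_labels=[0, 0, 0, 1, 1, 1])
```

One caveat of this naive sketch: reassigning blocks by raw residual fit tends to favor clusters with more components, because a higher-dimensional subspace usually leaves a smaller residual. This is plausibly among the estimation challenges the abstract refers to; how the paper resolves it is not reproduced here.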
