Clusterwise simultaneous component analysis for analyzing structural differences in multivariate multiblock data.

Many studies yield multivariate multiblock data, that is, multiple data blocks that all involve the same set of variables (e.g., the scores of different groups of subjects on the same set of variables). The question then arises whether the same processes underlie the different data blocks. To explore the structure of such multivariate multiblock data, component analysis can be very useful. Specifically, two approaches are often applied: principal component analysis (PCA) on each data block separately and different variants of simultaneous component analysis (SCA) on all data blocks simultaneously. The PCA approach yields a different loading matrix for each data block and is thus not useful for discovering structural similarities. The SCA approach may fail to yield insight into structural differences, since the obtained loading matrix is identical for all data blocks. We introduce a new generic modeling strategy, called clusterwise SCA, that comprises the separate PCA approach and SCA as special cases. The key idea behind clusterwise SCA is that the data blocks form a few clusters: data blocks that belong to the same cluster are modeled with SCA and thus share the same structure, whereas different clusters have different underlying structures. In this article, we use the SCA variant that imposes equal average cross-products (ECP) constraints. An algorithm for fitting clusterwise SCA-ECP solutions is proposed and evaluated in a simulation study. Finally, the usefulness of clusterwise SCA is illustrated by empirical examples from eating disorder research and social psychology.
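
The clustering idea can be made concrete with a minimal sketch. It is an illustration under stated assumptions, not the algorithm proposed in the article: it alternates between fitting one common loading matrix per cluster and reassigning each data block to the cluster whose loadings reconstruct it best. For simplicity, the loadings come from a truncated SVD of the stacked blocks, which does not enforce the ECP constraints, and only a single random start is used. The function names (`fit_sca`, `block_loss`, `clusterwise_sca`) and all parameters are hypothetical.

```python
import numpy as np


def fit_sca(blocks, n_components):
    """Common loadings for a set of blocks via truncated SVD of the stacked data.

    Assumption: this is a simplified stand-in and does NOT impose ECP constraints.
    """
    stacked = np.vstack(blocks)
    _, _, vt = np.linalg.svd(stacked, full_matrices=False)
    return vt[:n_components].T  # variables x components, orthonormal columns


def block_loss(block, loadings):
    """Residual sum of squares after projecting a block onto the loading space."""
    residual = block - block @ loadings @ loadings.T
    return float(np.sum(residual ** 2))


def clusterwise_sca(blocks, n_clusters, n_components, n_iter=50, seed=0):
    """Alternate between per-cluster loading fits and block reassignment."""
    rng = np.random.default_rng(seed)

    def fit_all(labels):
        fitted = []
        for k in range(n_clusters):
            members = [b for b, lab in zip(blocks, labels) if lab == k]
            if not members:
                # Assumption: an empty cluster is re-seeded with one random block.
                members = [blocks[rng.integers(len(blocks))]]
            fitted.append(fit_sca(members, n_components))
        return fitted

    labels = rng.integers(n_clusters, size=len(blocks))  # random initial partition
    loadings = fit_all(labels)
    for _ in range(n_iter):
        new_labels = np.array(
            [int(np.argmin([block_loss(b, B) for B in loadings])) for b in blocks]
        )
        if np.array_equal(new_labels, labels):
            break  # partition is stable
        labels = new_labels
        loadings = fit_all(labels)
    return labels, loadings
```

With one cluster, the sketch reduces to a single simultaneous analysis of all blocks. With as many clusters as blocks, it can end up fitting each block separately, which mirrors the special cases mentioned above. A procedure of this kind can stop in a local optimum, so in practice it would be run from several random starts.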

[1]  Maurizio Vichi,et al.  Three-Mode Component Analysis with Crisp or Fuzzy Partition of Units , 2005 .

[2]  I. Jolliffe Principal Component Analysis , 2002 .

[3]  R. Gorsuch,et al.  Effects of under- and overextraction on principal axis factor analysis with varimax rotation. , 1996 .

[4]  Henk A. L. Kiers,et al.  Hierarchical relations between methods for simultaneous component analysis and a technique for rotation to a simple simultaneous structure , 1994 .

[5]  J. MacQueen Some methods for classification and analysis of multivariate observations , 1967 .

[6]  Eva Ceulemans,et al.  How to perform multiblock component analysis in practice , 2011, Behavior Research Methods.

[7]  D. Espelage,et al.  Relations among exercise, coping, disordered eating, and psychological health among college students. , 2004, Eating behaviors.

[8]  Gavin L. Fox,et al.  Cautionary Remarks on the Use of Clusterwise Regression , 2008, Multivariate behavioral research.

[9]  Francis Tuerlinckx,et al.  A Hierarchical Ornstein–Uhlenbeck Model for Continuous Repeated Measurement Data , 2009 .

[10]  S. Kennedy,et al.  The role of physical activity in the development and maintenance of eating disorders , 1994, Psychological Medicine.

[11]  Kamel Jedidi,et al.  STEMM: A General Finite Mixture Structural Equation Model , 1997 .

[12]  L. Tucker A METHOD FOR SYNTHESIS OF FACTOR ANALYSIS STUDIES , 1951 .

[13]  Eric van Dijk,et al.  Coordination rules in asymmetric social dilemmas: a comparison between public good dilemmas and resource dilemmas , 1995 .

[14]  Neil J. MacKinnon,et al.  The Structure of Emotions: Canada-United States Comparisons , 1989 .

[15]  J. Russell,et al.  Core affect, prototypical emotional episodes, and other things called emotion: dissecting the elephant. , 1999, Journal of personality and social psychology.

[16]  K. Scherer,et al.  The World of Emotions is not Two-Dimensional , 2007, Psychological science.

[17]  P. Kroonenberg Applied Multiway Data Analysis , 2008 .

[18]  E. Diener,et al.  Norms for experiencing emotions in different cultures: inter- and intranational differences. , 2001, Journal of personality and social psychology.

[19]  Rex B. Kline,et al.  Principles and Practice of Structural Equation Modeling , 1998 .

[20]  J. Norris,et al.  A naturalistic study of the impact of acute physical activity on feeling states and affect in women. , 1996, Health psychology : official journal of the Division of Health Psychology, American Psychological Association.

[21]  Johnny Fontaine,et al.  Cognitive structure of emotion terms in Indonesia and The Netherlands , 2002 .

[22]  Munni Begum,et al.  Positive affect, exercise and self-reported health in blue-collar women. , 2006, American journal of health behavior.

[23]  Hans-Hermann Bock,et al.  On the Interface between Cluster Analysis, Principal Component Analysis, and Multidimensional Scaling , 1987 .

[24]  Helmuth Späth,et al.  Algorithm 39 Clusterwise linear regression , 1979, Computing.

[25]  M. Brusco A Repetitive Branch-and-Bound Procedure for Minimum Within-Cluster Sums of Squares Partitioning , 2006, Psychometrika.

[26]  Cordelia Schmid,et al.  High-dimensional data clustering , 2006, Comput. Stat. Data Anal..

[27]  M. Brusco,et al.  A variable-selection heuristic for K-means clustering , 2001 .

[28]  Michael C. Hout,et al.  Multidimensional Scaling , 2003, Encyclopedic Dictionary of Archaeology.

[29]  Yiu-Fai Yung,et al.  Finite mixtures in confirmatory factor-analysis models , 1997 .

[30]  B. Fredrickson,et al.  Psychological resilience and positive emotional granularity: examining the benefits of positive emotions on coping and health. , 2004, Journal of personality.

[31]  Karl Pearson F.R.S. LIII. On lines and planes of closest fit to systems of points in space , 1901 .

[32]  Age K Smilde,et al.  Bootstrap confidence intervals in multi-level simultaneous component analysis. , 2009, The British journal of mathematical and statistical psychology.

[33]  K. G. J8reskoC,et al.  Simultaneous Factor Analysis in Several Populations , 2007 .

[34]  Jeroen K. Vermunt,et al.  7. Multilevel Latent Class Models , 2003 .

[35]  Pierre Hansen,et al.  An improved column generation algorithm for minimum sum-of-squares clustering , 2009, Math. Program..

[36]  H. L. Le Roy,et al.  Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability; Vol. IV , 1969 .

[37]  J. Carroll,et al.  K-means clustering in a low-dimensional Euclidean space , 1994 .

[38]  Joseph S. Verducci,et al.  Multivariate Statistical Modeling and Data Analysis. , 1988 .

[39]  R. Baumeister,et al.  Binge eating as escape from self-awareness. , 1991, Psychological bulletin.

[40]  T. Dalgleish,et al.  Handbook of cognition and emotion , 1999 .

[41]  Eva Ceulemans,et al.  Factorial and reduced K-means reconsidered , 2010, Comput. Stat. Data Anal..

[42]  S E Solenberger,et al.  Exercise and eating disorders: a 3-year inpatient hospital record analysis. , 2001, Eating behaviors.

[43]  Maurizio Vichi,et al.  Simultaneous Component and Clustering Models for Three-way Data: Within and Between Approaches , 2007, J. Classif..

[44]  I. Mechelen,et al.  SCA with rotation to distinguish common and distinctive information in linked data , 2013, Behavior Research Methods.

[45]  William R. Dillon Latent Class and Finite Mixture Models , 2010 .

[46]  D. Sörbom A GENERAL METHOD FOR STUDYING DIFFERENCES IN FACTOR MEANS AND FACTOR STRUCTURE BETWEEN GROUPS , 1974 .

[47]  W. Hays Statistics for psychologists , 1963 .

[48]  M. Schader,et al.  New Approaches in Classification and Data Analysis , 1994 .

[49]  C. Davis,et al.  Compulsive physical activity in adolescents with anorexia nervosa: a psychobehavioral spiral of pathology. , 1999, The Journal of nervous and mental disease.

[50]  T. Berge Least squares optimization in multivariate analysis , 2005 .

[51]  Marieke E. Timmerman,et al.  Four simultaneous component models for the analysis of multivariate time series from more than one subject to model intraindividual and interindividual differences , 2003 .

[52]  Bengt Muthén,et al.  Multilevel latent variable modeling in multiple populations , 1997 .

[53]  Jelte M. Wicherts,et al.  Testing Measurement Invariance in the Target Rotated Multigroup Exploratory Factor Model , 2009 .

[54]  David J. Hessen,et al.  The Multigroup Common Factor Model With Minimal Uniqueness Constraints and the Power to Detect Uniform Bias , 2006 .

[55]  E van Dijk,et al.  Decision-induced focusing in social dilemmas: give-some, keep-some, take-some, and leave-some dilemmas. , 2000, Journal of personality and social psychology.

[56]  T. Haavelmo The Statistical Implications of a System of Simultaneous Equations , 1943 .

[57]  B. Everitt,et al.  A Monte Carlo Study of the Recovery of Cluster Structure in Binary Data by Hierarchical Clustering Techniques. , 1987, Multivariate behavioral research.

[58]  H. Kiers,et al.  Factorial k-means analysis for two-way data , 2001 .

[59]  F.J.R. van de Vijver,et al.  Methods and Data Analysis for Cross-Cultural Research , 1997 .

[60]  W. David Pierce,et al.  A theory of activity-based anorexia , 1983 .

[61]  Craig A. Smith,et al.  From appraisal to emotion: Differences among unpleasant feelings , 1988 .

[62]  Geoffrey J. McLachlan,et al.  Finite Mixture Models , 2019, Annual Review of Statistics and Its Application.

[63]  C. Davis,et al.  Eating Disorders and Hyperactivity: A Psychobiological Perspective , 1997, Canadian journal of psychiatry. Revue canadienne de psychiatrie.

[64]  Olejnik,et al.  Measures of Effect Size for Comparative Studies: Applications, Interpretations, and Limitations. , 2000, Contemporary educational psychology.

[65]  T. Dalgleish Basic Emotions , 2004 .

[66]  Douglas Steinley,et al.  Local optima in K-means clustering: what you don't know may hurt you. , 2003, Psychological methods.

[67]  Roger E. Millsap,et al.  Component analysis in cross-sectional and longitudinal data , 1988 .

[68]  R. Cattell The Scree Test For The Number Of Factors. , 1966, Multivariate behavioral research.

[69]  Eva Ceulemans,et al.  Tolerance of Justice Violations: The Effects of Need on Emotional Reactions After Violating Equality in Social Dilemmas , 2011 .

[70]  Henk A. M. Wilke,et al.  Decision-induced focusing in social dilemmas: give-some, keep-some, take-some, and leave-some dilemmas. , 2000, Journal of personality and social psychology.

[71]  L. Tucker,et al.  An individual differences model for multidimensional scaling , 1963 .

[72]  Jeroen K. Vermunt,et al.  Multilevel Mixture Factor Models , 2012, Multivariate behavioral research.

[73]  D. Jackson,et al.  A Comparison Of Component And Factor Patterns: A Monte Carlo Approach. , 1982, Multivariate behavioral research.

[74]  C. Davis,et al.  Obsessionality in Anorexia Nervosa: The Moderating Influence of Exercise , 1998, Psychosomatic medicine.

[75]  Helmuth Späth,et al.  A fast algorithm for clusterwise linear regression , 1982, Computing.

[76]  Wayne S. DeSarbo,et al.  A simulated annealing methodology for clusterwise linear regression , 1989 .

[77]  Jeroen K. Vermunt,et al.  Multilevel latent variable modeling : An application in educational testing , 2008 .

[78]  H. Kaiser The varimax criterion for analytic rotation in factor analysis , 1958 .

[79]  A. E. Maxwell,et al.  Factor Analysis as a Statistical Method. , 1964 .

[80]  W. Velicer,et al.  The Effects of Overextraction on Factor and Component Analysis. , 1992, Multivariate behavioral research.

[81]  Roger E. Millsap,et al.  On component analyses , 1985 .

[82]  David De Cremer,et al.  All is well that ends well, at least for proselfs: Emotional reactions to equality violation as a function of social value orientation , 2005 .

[83]  P. Beumont,et al.  Excessive physical activity in dieting disorder patients: proposals for a supervised exercise program. , 1994, The International journal of eating disorders.

[84]  Frank Rijmen,et al.  Drive for thinness, affect regulation and physical activity in eating disorders: a daily life study. , 2007, Behaviour research and therapy.

[85]  Iven Van Mechelen,et al.  UvA-DARE ( Digital Academic Repository ) A structured overview of simultaneous component based data integration , 2009 .

[86]  Han L. J. van der Maas,et al.  Fitting multivariage normal finite mixtures subject to structural equation modeling , 1998 .

[87]  L. F. Barrett Discrete Emotions or Dimensions? The Role of Valence Focus and Arousal Focus , 1998 .

[88]  G. W. Milligan,et al.  The Effect of Cluster Size, Dimensionality, and the Number of Clusters on Recovery of True Cluster Structure , 1983, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[89]  Glenn Waller,et al.  Excessive exercise in anorexia nervosa and bulimia nervosa: relation to eating characteristics and general psychopathology. , 2002, The International journal of eating disorders.

[90]  A. T. Church,et al.  The Structure And Personality Correlates Of Affect In Mexico , 2003 .

[91]  L. Hubert,et al.  Comparing partitions , 1985 .

[92]  M. Brusco,et al.  ConPar: a method for identifying groups of concordant subject proximity matrices for subsequent multidimensional scaling analyses , 2005 .

[93]  Beate Herpertz-Dahlmann,et al.  The contribution of anxiety and food restriction on physical activity levels in acute anorexia nervosa. , 2004, The International journal of eating disorders.

[94]  R. Kline Principles and practice of structural equation modeling, 2nd ed. , 2005 .

[95]  A. Basilevsky,et al.  Factor Analysis as a Statistical Method. , 1964 .