Clustering of Variables Around Latent Components

Abstract: Clustering of variables around latent components is investigated as a way to organize multivariate data into meaningful structures. Three situations are covered: (i) the case where correlated variables should be grouped together regardless of whether the correlation is positive or negative; (ii) the case where a negative correlation indicates strong disagreement between variables; (iii) an extension of the clustering techniques that takes external data into account in order to explain the clustering of variables. The strategy consists of a hierarchical cluster analysis followed by a partitioning algorithm. Both algorithms maximize the same criterion, which reflects how strongly the variables in each cluster are related to the latent variable associated with that cluster. The approach is illustrated with real data sets from sensory studies.
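
The partitioning step for case (i) can be illustrated with a minimal sketch. The abstract does not state the criterion explicitly, so the sketch assumes that each cluster's latent component is its first standardized principal component and that the criterion is the sum of squared covariances between each variable and the latent component of its cluster. The function names (`latent_component`, `partition_variables`) and the `init_labels` argument are hypothetical; in the strategy described above, the initial partition would come from the hierarchical step, which is not shown here.

```python
import numpy as np


def latent_component(Xk):
    """First principal component of the centered columns of Xk (n x p_k),
    rescaled to unit variance (1/n divisor). Assumed form of the latent component."""
    U, _, _ = np.linalg.svd(Xk, full_matrices=False)
    return U[:, 0] * np.sqrt(Xk.shape[0])


def partition_variables(X, init_labels, n_iter=50):
    """Alternate between (a) computing one latent component per cluster and
    (b) moving each variable to the cluster whose component it has the
    largest squared covariance with. Labels must be integers 0..K-1."""
    X = X - X.mean(axis=0)                 # center each variable
    n = X.shape[0]
    labels = np.asarray(init_labels).copy()
    K = int(labels.max()) + 1
    for _ in range(n_iter):
        comps = np.column_stack(
            [latent_component(X[:, labels == k]) for k in range(K)]
        )
        cov = X.T @ comps / n              # p x K covariances
        new_labels = np.argmax(cov ** 2, axis=1)
        # Stop if a cluster would become empty or nothing changes.
        if len(np.unique(new_labels)) < K or np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return labels


def criterion(X, labels):
    """Sum over variables of the squared covariance with their own cluster's
    latent component (the quantity the sketch tries to increase)."""
    X = X - X.mean(axis=0)
    n = X.shape[0]
    total = 0.0
    for k in np.unique(labels):
        Xk = X[:, labels == k]
        total += np.sum((Xk.T @ latent_component(Xk) / n) ** 2)
    return total
```

Because the covariance is squared, a variable and its negatively correlated counterpart end up in the same cluster, which matches case (i). Case (ii) would need a signed criterion instead and is not covered by this sketch.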
