Clustering of variables around latent components: an application in consumer science

The present work proposes a method based on CLV (Clustering around Latent Variables) for identifying groups of consumers in L-shaped data. This kind of data structure is very common in consumer studies, where a panel of consumers is asked to rate their overall liking of a number of products and the preference scores are arranged in a two-way table Y. External information on both the products (physicochemical description or sensory attributes) and the consumers (socio-demographic background, purchase behaviour or consumption habits) may be available in a row descriptor matrix X and a column descriptor matrix Z, respectively. The aim of the method is to produce a consumer segmentation automatically, with all three matrices playing an active role in the classification, so that the resulting groups are homogeneous with respect to preference, product characteristics and consumer characteristics. The proposed clustering method is illustrated on data from preference studies on food products: berry fruit juices and traditional cheeses from Trentino. The hedonic ratings given by the consumer panel to the products under study are explained in terms of the products' chemical compounds and sensory evaluation, and of the consumers' socio-demographic information, purchase behaviour and consumption habits.
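To make the idea of clustering consumers around latent components concrete, the following is a minimal sketch of the basic CLV alternating scheme applied to the preference table Y alone (products in rows, consumers in columns). It is an illustration under stated assumptions, not the L-shaped procedure proposed here: the external blocks X and Z are left out, each latent component is taken as the standardized mean of the centred columns in its group, and each consumer is reassigned to the component with which its scores have the highest covariance. The function name `clv_sketch` and the `init` parameter are hypothetical.

```python
import numpy as np


def clv_sketch(Y, n_clusters, init=None, n_iter=50, seed=0):
    """Partition the columns of Y (products x consumers) around latent components.

    Assumption: plain CLV on Y only, without the descriptor matrices X and Z.
    Returns (labels, comps): one cluster label per consumer and one
    unit-variance latent component per cluster (as columns of comps).
    """
    Y = np.asarray(Y, dtype=float)
    Yc = Y - Y.mean(axis=0)              # centre each consumer's scores
    n, p = Yc.shape
    rng = np.random.default_rng(seed)
    if init is None:
        labels = rng.integers(n_clusters, size=p)   # random starting partition
    else:
        labels = np.asarray(init).copy()

    comps = np.zeros((n, n_clusters))
    for _ in range(n_iter):
        # Step 1: latent component of each cluster = standardized group mean.
        for k in range(n_clusters):
            members = Yc[:, labels == k]
            if members.shape[1] == 0:
                # Empty cluster: restart it from a randomly chosen consumer.
                c = Yc[:, rng.integers(p)]
            else:
                c = members.mean(axis=1)
            sd = c.std()
            comps[:, k] = c / sd if sd > 0 else c

        # Step 2: reassign each consumer to the component of highest covariance.
        cov = Yc.T @ comps / n           # shape (consumers, clusters)
        new_labels = cov.argmax(axis=1)
        if np.array_equal(new_labels, labels):
            break                        # partition is stable
        labels = new_labels
    return labels, comps
```

Calling `clv_sketch(Y, 2)` on a products-by-consumers array returns a cluster label for every consumer together with the two latent preference directions. Bringing X and Z into the criterion, as the method described above does, would change how the latent components are computed; that step is not shown in this sketch.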
