A comparison of three methods for principal component analysis of fuzzy interval data

Vertices Principal Component Analysis (V-PCA) and Centers Principal Component Analysis (C-PCA) generalize Principal Component Analysis (PCA) to summarize interval-valued data. Neural Network Principal Component Analysis (NN-PCA) extends PCA to fuzzy interval data. The first two methods can also be applied to fuzzy interval data, but they then ignore the spread information. In the literature, V-PCA is usually considered computationally cumbersome because it requires transforming the interval-valued data matrix into a single-valued data matrix whose number of rows grows exponentially with the number of variables and linearly with the number of observation units. However, it has been shown that this problem can be overcome by working with the cross-products matrix, which is easy to compute. A review of C-PCA, V-PCA (including the computational shortcut for V-PCA), and NN-PCA is provided. The three methods are then compared by means of a simulation study and an application to an empirical data set. In the simulation study, fuzzy interval data are generated according to various models, and it is reported under which conditions each method performs best.
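To make the computational point concrete, the following is a minimal sketch in Python with NumPy, not the implementation used in the study. It assumes each interval is stored as a midpoint and a radius (the hypothetical arrays `mid` and `rad`, both of shape n by p), builds the full vertices matrix with n * 2^p rows by brute force, and checks it against a closed form for the cross-products matrix obtained by summing over the vertices of each hyperrectangle: 2^p (M'M + diag of the summed squared radii). The function names and the toy data are illustrative assumptions, and the closed form shown here may differ in presentation from the shortcut described in the literature.

```python
import itertools
import numpy as np

def vertices_matrix(mid, rad):
    """Stack all 2**p vertices of every observation's hyperrectangle.

    Returns an array with n * 2**p rows, which is what makes the
    naive approach expensive as the number of variables p grows.
    """
    n, p = mid.shape
    # Every combination of lower (-1) and upper (+1) bounds: shape (2**p, p).
    signs = np.array(list(itertools.product([-1.0, 1.0], repeat=p)))
    verts = mid[:, None, :] + signs[None, :, :] * rad[:, None, :]
    return verts.reshape(n * 2**p, p)

def vertices_crossproducts(mid, rad):
    """Cross-products matrix of the vertices matrix, without building it.

    Off-diagonal terms depend only on the midpoints because the +/- radius
    terms cancel when summed over all vertices; the radii contribute only
    to the diagonal.
    """
    n, p = mid.shape
    return 2**p * (mid.T @ mid + np.diag((rad**2).sum(axis=0)))

# Toy interval data (assumed, for illustration only).
rng = np.random.default_rng(0)
n, p = 6, 4
mid = rng.normal(size=(n, p))
mid -= mid.mean(axis=0)          # centering the midpoints also centers the vertices
rad = rng.uniform(0.1, 1.0, size=(n, p))

Z = vertices_matrix(mid, rad)    # n * 2**p rows
brute = Z.T @ Z                  # p x p, computed the expensive way
short = vertices_crossproducts(mid, rad)  # p x p, computed from n x p inputs
assert np.allclose(brute, short)

# Component loadings: eigenvectors of the respective cross-products matrices.
# np.linalg.eigh returns eigenvalues in ascending order.
cpca_vals, cpca_vecs = np.linalg.eigh(mid.T @ mid)  # C-PCA uses midpoints only
vpca_vals, vpca_vecs = np.linalg.eigh(short)        # V-PCA also uses the radii
```

In this sketch the shortcut needs only the n by p midpoint and radius arrays, so its cost no longer grows exponentially with p. It also shows where the two methods differ: C-PCA decomposes the midpoint cross-products alone, while V-PCA adds the interval widths on the diagonal.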
