Principal component analysis on interval data

Summary

Real-world data analysis is often affected by different types of error, such as measurement errors, computation errors, and imprecision related to the method adopted for estimating the data. The uncertainty in the data, which is strictly connected to these errors, may be treated by considering, rather than a single value for each datum, the interval of values in which it may fall: interval data. Statistical units described by interval data can be regarded as a special case of Symbolic Object (SO). In Symbolic Data Analysis (SDA), these data are represented as boxes. Accordingly, the purpose of the present work is to extend Principal Component Analysis (PCA) to obtain a visualisation of such boxes in a lower-dimensional space, pointing out the relationships among the variables, among the units, and between the two. The aim is to use, when possible, the tools of interval algebra to adapt the mathematical models underlying classical PCA to the case in which an interval data matrix is given. The proposed method has been tested on a real data set, and the numerical results, which are in agreement with the theory, are reported.
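To make the notions of interval data and boxes concrete, the following minimal sketch shows how a unit described by interval variables might be represented, together with the standard interval arithmetic rules. It is illustrative only and is not the method of the paper: the `Interval` class, the `box_vertices` helper, and the sample values are all hypothetical names and data introduced here.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Interval:
    """Closed interval [lo, hi] standing in for an imprecisely known value."""
    lo: float
    hi: float

    def __post_init__(self):
        if self.lo > self.hi:
            raise ValueError("lower bound must not exceed upper bound")

    @property
    def midpoint(self):
        return (self.lo + self.hi) / 2

    @property
    def radius(self):
        return (self.hi - self.lo) / 2

    # Standard interval arithmetic: the result encloses every value
    # obtainable by combining one point from each operand.
    def __add__(self, other):
        return Interval(self.lo + other.lo, self.hi + other.hi)

    def __sub__(self, other):
        return Interval(self.lo - other.hi, self.hi - other.lo)

    def __mul__(self, other):
        products = [a * b for a, b in product((self.lo, self.hi),
                                              (other.lo, other.hi))]
        return Interval(min(products), max(products))


def box_vertices(unit):
    """Return all 2**p corner points of the box spanned by p intervals."""
    return list(product(*[(iv.lo, iv.hi) for iv in unit]))


# Hypothetical unit: two variables, each known only up to an error bound.
unit = [Interval(9.5, 10.5), Interval(-1.0, 2.0)]
```

A unit with p interval variables corresponds to a box with 2**p vertices, so `box_vertices(unit)` returns four corner points here. Each interval can equivalently be described by its midpoint and radius, for example `unit[0].midpoint` and `unit[0].radius`.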
