Principal component analysis for compositional data vectors

Since Aitchison’s foundational work, compositional data analysis has attracted growing attention. Principal component analysis (PCA), a powerful technique for exploratory analysis, has accordingly been extended to compositional data. Whereas most existing work applies PCA with the parts of a composition as variables, this paper develops PCA for compositional data vectors. Using the algebraic operations of the simplex, the PCA procedure is derived and reduced to the computation of inner products. Properties of the resulting principal components are also investigated. Two real-data examples illustrate the merits of the proposed PCA for compositional data vectors.
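As a point of orientation, the simplex operations and inner product mentioned above can be sketched with the standard Aitchison geometry: perturbation, powering, and the inner product obtained from the centered log-ratio (clr) transform. This is only a minimal illustration under that assumption; the abstract does not specify the exact operators or the PCA derivation used in the paper, and the function names below are hypothetical.

```python
import math

def closure(x):
    """Rescale positive parts so they sum to 1."""
    total = sum(x)
    return [v / total for v in x]

def perturb(x, y):
    """Perturbation: the simplex analogue of vector addition."""
    return closure([a * b for a, b in zip(x, y)])

def power(alpha, x):
    """Powering: the simplex analogue of scalar multiplication."""
    return closure([v ** alpha for v in x])

def clr(x):
    """Centered log-ratio transform: log parts minus their mean."""
    logs = [math.log(v) for v in x]
    mean = sum(logs) / len(logs)
    return [l - mean for l in logs]

def aitchison_inner(x, y):
    """Aitchison inner product, computed as the dot product of clr vectors."""
    return sum(a * b for a, b in zip(clr(x), clr(y)))

# Illustrative compositions (made-up values, not from the paper).
x = closure([1.0, 2.0, 7.0])
y = closure([3.0, 3.0, 4.0])
z = closure([5.0, 1.0, 4.0])

# The inner product is linear with respect to perturbation and powering.
lhs = aitchison_inner(perturb(x, y), z)
rhs = aitchison_inner(x, z) + aitchison_inner(y, z)
print(abs(lhs - rhs) < 1e-9)
```

Because the inner product is linear in these operations, quantities such as variances and covariances of compositions can be expressed through inner products, which is the kind of reduction the abstract alludes to.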
