"compositions": A unified R package to analyze compositional data

This contribution presents a new R package called "compositions". It provides tools to analyze amount or compositional data sets in four different geometries, each associated with an R class: rplus (for amounts, or open compositions, in a real, classical geometry), aplus (for amounts in a logarithmic geometry), rcomp (for closed compositions in a real geometry) and acomp (for closed compositions in a logistic geometry, following a log-ratio approach). The package makes it possible to compare the results obtained with these four approaches, since an analogous analysis can be performed in each geometry with minimal and straightforward modifications of the instructions. Besides these four basic classes, the package also includes: basic features such as data transformations (e.g. logarithm or additive logistic transform); basic statistics (both the classical ones and those developed in the log-ratio framework of compositional analysis); high-level graphics (such as ternary diagram matrices and scatter plots); and high-level analyses (e.g. principal components or cluster analysis). The results of these functions and analyses are presented consistently across the four geometries to ease their comparison.
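To make the idea of "the same analysis in four geometries" concrete, the following is a minimal sketch in plain Python, not the package's R interface. It computes the centre of a small data set under each geometry, assuming the usual conventions: an arithmetic mean for rplus, a geometric mean for aplus, an arithmetic mean of closed rows for rcomp, and a closed geometric mean for acomp. The function names (`closure`, `centre`) and the sample values are hypothetical and chosen only for illustration.

```python
import math


def closure(x, total=1.0):
    """Rescale a vector of positive amounts so that its parts sum to `total`."""
    s = sum(x)
    return [total * v / s for v in x]


def arithmetic_mean(rows):
    """Column-wise arithmetic mean."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def geometric_mean(rows):
    """Column-wise geometric mean (requires strictly positive values)."""
    n = len(rows)
    return [math.exp(sum(math.log(v) for v in col) / n) for col in zip(*rows)]


def centre(rows, geometry):
    """Centre of the data under one of the four geometries (illustrative only)."""
    if geometry == "rplus":   # amounts, real geometry
        return arithmetic_mean(rows)
    if geometry == "aplus":   # amounts, logarithmic geometry
        return geometric_mean(rows)
    if geometry == "rcomp":   # closed compositions, real geometry
        return arithmetic_mean([closure(r) for r in rows])
    if geometry == "acomp":   # closed compositions, log-ratio geometry
        # The closed geometric mean does not depend on the scale of each row,
        # so the raw amounts can be used directly before closing.
        return closure(geometric_mean(rows))
    raise ValueError(f"unknown geometry: {geometry}")


# Hypothetical three-part data set (rows are samples, columns are parts).
samples = [
    [1.0, 2.0, 7.0],
    [4.0, 4.0, 2.0],
    [2.0, 1.0, 1.0],
]

centres = {g: centre(samples, g) for g in ("rplus", "aplus", "rcomp", "acomp")}
for name, value in centres.items():
    print(name, [round(v, 4) for v in value])
```

Only the `geometry` argument changes between the four calls, which mirrors the abstract's point that switching approach should require minimal changes to the instructions. In the package itself this choice is expressed through the class of the data object rather than through a string argument.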
