Measuring Subcompositional Incoherence

Subcompositional coherence is a fundamental property of Aitchison’s approach to compositional data analysis and is the principal justification for using ratios of components. We maintain, however, that lack of subcompositional coherence (i.e., incoherence) can be measured to evaluate whether any given technique is close enough, for all practical purposes, to being subcompositionally coherent. This opens up the field to alternative methods that might be better suited to cope with problems such as data zeros and outliers while being only slightly incoherent. The measure that we propose is based on a distance measure between components. We show that the two-part subcompositions, which appear to be the most sensitive to subcompositional incoherence, can be used to establish a distance matrix that can be directly compared with the pairwise distances in the full composition. The closeness of these two matrices can be quantified using a stress measure that is common in multidimensional scaling, providing a measure of subcompositional incoherence. The approach is illustrated using power-transformed correspondence analysis, which has already been shown to converge to log-ratio analysis as the power transform tends to zero.
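The idea of comparing full-composition distances with two-part subcomposition distances can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`close`, `euclid_dist`, `logratio_dist`, `incoherence_stress`), the toy data, and the particular stress normalisation (a Stress-1 style ratio of sums of squares) are all hypothetical choices made here, and plain Euclidean distance on proportions stands in for whatever incoherent method one wishes to assess.

```python
import numpy as np

def close(X):
    """Re-close rows so that each row sums to 1 (a composition)."""
    X = np.asarray(X, dtype=float)
    return X / X.sum(axis=1, keepdims=True)

def euclid_dist(u, v):
    """Euclidean distance between two component columns (not coherent)."""
    return np.sqrt(np.sum((u - v) ** 2))

def logratio_dist(u, v):
    """Root centred sum of squares of log(u/v); depends only on ratios."""
    r = np.log(u / v)
    return np.sqrt(np.sum((r - r.mean()) ** 2))

def incoherence_stress(X, dist):
    """Compare each pairwise component distance in the full composition
    with the same distance recomputed in the re-closed two-part
    subcomposition, summarised by an assumed Stress-1 style formula."""
    P = close(X)
    n_parts = P.shape[1]
    d_full, d_sub = [], []
    for j in range(n_parts):
        for k in range(j + 1, n_parts):
            d_full.append(dist(P[:, j], P[:, k]))
            S = close(P[:, [j, k]])  # two-part subcomposition
            d_sub.append(dist(S[:, 0], S[:, 1]))
    d_full = np.array(d_full)
    d_sub = np.array(d_sub)
    return np.sqrt(np.sum((d_full - d_sub) ** 2) / np.sum(d_full ** 2))

# Toy data: 4 samples, 3 strictly positive parts (made up for illustration).
X = np.array([[10, 20, 70],
              [30, 30, 40],
              [5, 60, 35],
              [25, 25, 50]])

stress_logratio = incoherence_stress(X, logratio_dist)
stress_euclid = incoherence_stress(X, euclid_dist)
print(stress_logratio, stress_euclid)
```

Because the log-ratio distance depends only on ratios within each sample, re-closing a subcomposition leaves it unchanged and the stress is zero up to rounding, whereas the Euclidean distance on proportions yields a strictly positive value for these data.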
