Spatial Regression Modeling for Compositional Data With Many Zeros

Compositional data analysis considers vectors of nonnegative-valued variables subject to a unit-sum constraint. Our interest lies in spatial compositional data, in particular, land use/land cover (LULC) data in the northeastern United States. Here, the observations are vectors providing the proportions of LULC types observed in each 3 km × 3 km grid cell, yielding on the order of 10^4 cells. On the same grid cells, we have an additional compositional dataset supplying forest fragmentation proportions. Potentially useful and available covariates include elevation range, road length, population, median household income, and housing levels. We propose a spatial regression model that captures both flexible dependence among the components of the observation vector at each location and spatial dependence across the locations of the simplex-restricted measurements. A key issue is the high incidence of observed zero proportions in the LULC dataset, which requires incorporating local point masses at 0. We build a hierarchical model that prescribes a power scaling at the first stage and uses latent variables at the second stage, with spatial structure for these variables supplied through a multivariate CAR specification. Analyses of the LULC and forest fragmentation data illustrate the interpretation of the regression coefficients and the benefit of incorporating spatial smoothing.
