An Eigenvalue Test for spatial Principal Component Analysis

Spatial Principal Component Analysis (sPCA; Jombart et al. 2008) is designed to investigate non-random spatial distributions of genetic variation. Unfortunately, the associated tests used to assess the existence of spatial patterns (the global and local tests; Jombart et al. 2008) lack statistical power and may fail to reveal existing spatial patterns. Here, we present a non-parametric test for the significance of specific patterns recovered by sPCA. We compared the performance of this new test with that of the original global and local tests using datasets simulated under classical population genetic models. The results show that our test outperforms the original global and local tests, exhibiting improved statistical power while retaining similar and reliable type I error rates. Moreover, because it allows various sets of axes to be tested, it can be used to guide the selection of retained sPCA components. As such, it represents a valuable complement to the original analysis and should prove useful for the investigation of spatial genetic patterns.
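
To make the idea of a non-parametric eigenvalue test concrete, the following is a minimal Python sketch of a generic Monte Carlo permutation test, not the authors' procedure. It assumes that the sPCA eigenvalues are obtained from the symmetric matrix (1/(2n)) X^T (W + W^T) X, with X the centred data and W a row-standardised spatial weight matrix, and it tests only the largest positive eigenvalue. The function names (`spca_eigenvalues`, `largest_eigenvalue_test`, `chain_weights`) and the toy transect data are hypothetical and introduced for illustration only.

```python
import numpy as np


def spca_eigenvalues(X, W):
    """Eigenvalues of (1/(2n)) Xc^T (W + W^T) Xc, sorted in decreasing order.

    X is an (n sites x p variables) array, W an (n x n) spatial weight matrix.
    The form of the matrix is an assumption of this sketch.
    """
    n = X.shape[0]
    Xc = X - X.mean(axis=0)                # centre each variable
    M = Xc.T @ (W + W.T) @ Xc / (2.0 * n)  # symmetric by construction
    return np.sort(np.linalg.eigvalsh(M))[::-1]


def largest_eigenvalue_test(X, W, n_perm=199, seed=0):
    """Permutation p-value for the largest eigenvalue.

    Rows of X are shuffled across sites, which breaks any association
    between the data and the spatial weights while keeping the data intact.
    """
    rng = np.random.default_rng(seed)
    observed = spca_eigenvalues(X, W)[0]
    simulated = np.empty(n_perm)
    for i in range(n_perm):
        perm = rng.permutation(X.shape[0])
        simulated[i] = spca_eigenvalues(X[perm], W)[0]
    # Count the observed value as one of the simulations (standard correction).
    p_value = (1 + np.sum(simulated >= observed)) / (n_perm + 1)
    return observed, p_value


def chain_weights(n):
    """Row-standardised weights linking each site on a line to its neighbours."""
    A = np.zeros((n, n))
    idx = np.arange(n - 1)
    A[idx, idx + 1] = 1.0
    A[idx + 1, idx] = 1.0
    return A / A.sum(axis=1, keepdims=True)


# Toy data: 30 sites on a transect, 5 variables sharing a smooth gradient.
rng = np.random.default_rng(1)
n = 30
gradient = np.linspace(-1.0, 1.0, n)[:, None]
loadings = np.array([[1.0, -0.5, 0.8, 0.3, -1.2]])
X = gradient * loadings + 0.1 * rng.normal(size=(n, 5))

observed, p_value = largest_eigenvalue_test(X, chain_weights(n))
print(f"largest eigenvalue = {observed:.3f}, permutation p-value = {p_value:.3f}")
```

In this toy setting the data follow a strong gradient along the transect, so the observed eigenvalue should fall well above the permuted values. A test of local (negative) structures, or of several axes in turn, would need further steps that this sketch does not attempt.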

[1] H. Hotelling. Analysis of a complex of statistical variables into principal components. 1933.

[2] R Core Team et al. R: A language and environment for statistical computing. 2014.

[3] M. Stephens et al. Interpreting principal component analyses of spatial population genetic variation. Nature Genetics, 2008.

[4] Donald A. Jackson et al. How many principal components? Stopping rules for determining the number of non-trivial axes revisited. Comput. Stat. Data Anal., 2005.

[5] Thibaut Jombart et al. adegenet: a R package for the multivariate analysis of genetic markers. Bioinform., 2008.

[6] T. Jombart et al. Revealing cryptic spatial patterns in genetic variability by a new multivariate method. Heredity, 2008.

[7] Guangqing Chi et al. Applied Spatial Data Analysis with R. 2015.

[8] T. Jombart et al. Genetic markers in the playground of multivariate analysis. Heredity, 2009.

[9] P. Moran. Notes on continuous stochastic phenomena. Biometrika, 1950.

[10] Antonio Carvajal-Rodríguez et al. GENOMEPOP: A program to simulate genomes in populations. BMC Bioinformatics, 2008.

[11] Thibaut Jombart et al. adegenet 1.3-1: new tools for the analysis of genome-wide SNP data. Bioinform., 2011.

[12] S. Cushman et al. Spurious correlations and inference in landscape genetics. Molecular Ecology, 2010.

[13] F. Balloux et al. Discriminant analysis of principal components: a new method for the analysis of genetically structured populations. BMC Genetics, 2010.

[14] Karl Pearson, F.R.S. LIII. On lines and planes of closest fit to systems of points in space. 1901.

[15] Aurélie Coulon et al. Identifying future research needs in landscape genetics: where to from here? Landscape Ecology, 2009.

[16] D. Comas et al. The influence of habitats on female mobility in Central and Western Africa inferred from human mitochondrial variation. BMC Evolutionary Biology, 2013.

[17] J. Travis et al. ALADYN - a spatially explicit, allelic model for simulating adaptive dynamics. Ecography, 2014.