Inferring population genetic structure from large-scale genotyping of single-nucleotide polymorphisms or variants is an important technique for studying the history and distribution of extant human populations, and it is also an important tool for adjusting tests of association. However, the inferred structure depends on the minor allele frequency of the variants used, which matters particularly when considering the phenotypic association of rare variants. Using the Genetic Analysis Workshop 18 data set for 142 unrelated individuals, which includes genotypes for many rare variants, we study the following hypothesis: the difference in detected structure is the result of a "scale" effect; that is, rare variants are likely to be shared only locally (smaller scale), whereas common variants can be spread over longer distances. The effect is similar to that seen in kernel principal component analysis as the bandwidth of the kernel is changed. We show how different structures become evident as we consider rare or common variants.
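The comparison described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the analysis pipeline used in the study: the genotypes are simulated rather than taken from the Genetic Analysis Workshop 18 data, the 0.05 minor allele frequency cutoff and the two bandwidth multipliers are hypothetical choices, and the helper names (`minor_allele_frequency`, `pca_scores`, `gaussian_kernel_pca`) are introduced here only for illustration.

```python
import numpy as np


def minor_allele_frequency(genotypes):
    """Per-variant minor allele frequency for a (samples x variants) 0/1/2 matrix."""
    freq = genotypes.mean(axis=0) / 2.0
    return np.minimum(freq, 1.0 - freq)


def pca_scores(genotypes, n_components=2):
    """PCA scores from the sample-by-sample Gram matrix of standardized genotypes."""
    centered = genotypes - genotypes.mean(axis=0)
    sd = centered.std(axis=0)
    keep = sd > 0  # drop monomorphic variants, which carry no information
    z = centered[:, keep] / sd[keep]
    gram = z @ z.T / z.shape[1]
    vals, vecs = np.linalg.eigh(gram)
    order = np.argsort(vals)[::-1][:n_components]
    return vecs[:, order] * np.sqrt(np.maximum(vals[order], 0.0))


def gaussian_kernel_pca(genotypes, bandwidth, n_components=2):
    """Kernel PCA scores with a Gaussian kernel of the given bandwidth."""
    g = genotypes.astype(float)
    sq_norms = (g ** 2).sum(axis=1)
    sq_dist = np.maximum(sq_norms[:, None] + sq_norms[None, :] - 2.0 * g @ g.T, 0.0)
    kernel = np.exp(-sq_dist / (2.0 * bandwidth ** 2))
    n = kernel.shape[0]
    ones = np.full((n, n), 1.0 / n)
    # Double-center the kernel matrix (centering in feature space).
    centered = kernel - ones @ kernel - kernel @ ones + ones @ kernel @ ones
    vals, vecs = np.linalg.eigh(centered)
    order = np.argsort(vals)[::-1][:n_components]
    return vecs[:, order] * np.sqrt(np.maximum(vals[order], 0.0))


# Simulated stand-in data: 142 samples, 2000 variants (assumption, not GAW18 data).
rng = np.random.default_rng(0)
allele_freqs = rng.beta(0.5, 2.0, size=2000)
genotypes = rng.binomial(2, allele_freqs, size=(142, 2000))

maf = minor_allele_frequency(genotypes)
rare = (maf > 0) & (maf < 0.05)   # hypothetical cutoff for "rare"
common = maf >= 0.05

rare_scores = pca_scores(genotypes[:, rare])
common_scores = pca_scores(genotypes[:, common])

# Kernel PCA on all variants at a narrow and a wide bandwidth, scaled by the
# median pairwise distance (the multipliers 0.5 and 2.0 are illustrative).
g = genotypes.astype(float)
sq = (g ** 2).sum(axis=1)
dist = np.sqrt(np.maximum(sq[:, None] + sq[None, :] - 2.0 * g @ g.T, 0.0))
median_dist = np.median(dist[np.triu_indices_from(dist, k=1)])
narrow_scores = gaussian_kernel_pca(genotypes, 0.5 * median_dist)
wide_scores = gaussian_kernel_pca(genotypes, 2.0 * median_dist)
```

Because the simulated samples come from a single unstructured population, none of these projections will show clusters; with real data, one would compare the rare-variant and common-variant projections against the narrow-bandwidth and wide-bandwidth kernel projections to see whether they reveal structure at matching scales.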