Japanese population structure, based on SNP genotypes from 7003 individuals compared to other ethnic groups: effects on population-based association studies.

Because population stratification can cause spurious associations in case-control studies, understanding population structure is important. Here, we examined Japanese population structure by "Eigenanalysis," using the genotypes for 140,387 SNPs in 7003 Japanese individuals, along with 60 European, 60 African, and 90 East Asian individuals from the HapMap project. Most Japanese individuals fell into two main clusters, Hondo and Ryukyu; the Hondo cluster includes most of the individuals from the main islands of Japan, and the Ryukyu cluster includes most of the individuals from Okinawa. The SNPs with the greatest frequency differences between the Hondo and Ryukyu clusters were found in the HLA region on chromosome 6. The nonsynonymous SNPs with the greatest frequency differences between the two clusters were the Val/Ala polymorphism (rs3827760) in the EDAR gene, associated with hair thickness, and the Gly/Ala polymorphism (rs17822931) in the ABCC11 gene, associated with ear-wax type. Genetic differentiation was observed even among different regions of Honshu Island, the largest island of Japan. Simulation studies showed that including different proportions of individuals from different regions of Japan in the case and control groups can inflate the rate of false-positive results when the sample sizes are large.
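
The last point, that unequal regional mixing in cases and controls inflates false positives as sample size grows, can be illustrated with a small simulation. The sketch below is not the procedure used in the study: it assumes two hypothetical subpopulations whose allele frequencies are drawn from a Balding-Nichols model with an illustrative differentiation value (`fst=0.01`), SNPs with no effect on disease, and a simple allelic chi-square test at the 5% level. All function names and parameter values are chosen for illustration only.

```python
import random

CHI2_CRIT_1DF = 3.841  # 5% critical value of chi-square with 1 degree of freedom


def allele_count(rng, n_people, freq):
    """Count copies of one allele among n_people diploid individuals."""
    return sum(1 for _ in range(2 * n_people) if rng.random() < freq)


def group_allele_count(rng, n_people, frac_a, freq_a, freq_b):
    """Allele count in a group where a fraction frac_a comes from subpopulation A."""
    n_a = round(n_people * frac_a)
    return (allele_count(rng, n_a, freq_a)
            + allele_count(rng, n_people - n_a, freq_b))


def allelic_chi2(case_count, ctrl_count, alleles_per_group):
    """Pearson chi-square for the 2x2 allele table with equal group sizes."""
    pooled = (case_count + ctrl_count) / (2 * alleles_per_group)
    if pooled in (0.0, 1.0):
        return 0.0  # monomorphic in the sample, no test possible
    diff = (case_count - ctrl_count) / alleles_per_group
    return diff ** 2 / (pooled * (1 - pooled) * 2 / alleles_per_group)


def false_positive_rate(n_per_group, frac_a_cases, frac_a_controls,
                        n_snps=300, fst=0.01, seed=1):
    """Fraction of null SNPs declared significant at the 5% level."""
    rng = random.Random(seed)
    scale = (1 - fst) / fst
    hits = 0
    for _ in range(n_snps):
        # Ancestral frequency, then subpopulation frequencies (Balding-Nichols).
        p = rng.uniform(0.1, 0.9)
        freq_a = rng.betavariate(p * scale, (1 - p) * scale)
        freq_b = rng.betavariate(p * scale, (1 - p) * scale)
        cases = group_allele_count(rng, n_per_group, frac_a_cases, freq_a, freq_b)
        ctrls = group_allele_count(rng, n_per_group, frac_a_controls, freq_a, freq_b)
        if allelic_chi2(cases, ctrls, 2 * n_per_group) > CHI2_CRIT_1DF:
            hits += 1
    return hits / n_snps


if __name__ == "__main__":
    print("balanced,   n=2000:", false_positive_rate(2000, 0.5, 0.5))
    print("stratified, n=200: ", false_positive_rate(200, 0.8, 0.2))
    print("stratified, n=2000:", false_positive_rate(2000, 0.8, 0.2))
```

Under these assumptions, the balanced design should stay near the nominal 5% rate, while the design with mismatched subpopulation proportions should exceed it, and more so at the larger sample size, because sampling noise shrinks while the frequency difference caused by stratification does not.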
