Exploring the genetic diversity of the Japanese population: Insights from a large-scale whole genome sequencing analysis

The Japanese archipelago is a terminal location for human migration, and the contemporary Japanese people represent a unique population whose genomic diversity has been shaped by multiple migrations from Eurasia. Using high-coverage whole-genome sequencing (WGS) of 9,850 samples from the National Center Biobank Network, we analyzed the genomic characteristics that define the genetic makeup of the modern Japanese population from a population genetics perspective. The dataset comprised populations from the Ryukyu Islands and other parts of the Japanese archipelago (Hondo). Low-frequency detrimental or pathogenic variants were found in these populations. The Hondo population underwent two episodes of population decline, one during the Jomon period (corresponding to the Late Neolithic) and one during the Edo period (corresponding to the Early Modern era), while the Ryukyu population experienced a population decline during the shell midden period of the Late Neolithic in this region. Genes related to alcohol and lipid metabolism were affected by positive natural selection. For two genes related to alcohol metabolism, the times at which positive natural selection began to act were found to differ by 12,500 years; this finding indicates that the genomic diversity of Japanese people has been shaped by events closely related to agriculture and food production.

Author summary

The human population in the Japanese archipelago exhibits significant genetic diversity: the Ryukyu Islands and the other parts of the archipelago (Hondo) have undergone distinct evolutionary paths that have contributed to the genetic divergence of the populations in each region. In this study, whole-genome sequencing of healthy individuals from national research hospital biobanks was used to investigate the genetic diversity of the Japanese population. Haplotypes were inferred from the genomic data, and a thorough population genetic analysis was conducted. The results indicated not only genetic differentiation between Hondo and the Ryukyu Islands, but also marked differences in past population size. In addition, gene genealogies were inferred from the haplotypes, and their patterns were examined for evidence of natural selection. This analysis revealed unique traces of natural selection in East Asian populations, many of which are believed to be linked to dietary changes brought about by agriculture and food production.
