Deep whole-genome sequencing reveals recent selection signatures linked to evolution and disease risk of Japanese

Understanding natural selection is crucial to unveiling the evolution of modern humans. Here, we report natural selection signatures in the Japanese population using high-depth whole-genome sequence (WGS) data (25.9×) from 2,234 individuals. Using rare singletons, we identify signals of very recent selection over the past 2,000–3,000 years at multiple loci (ADH cluster, MHC region, BRAP-ALDH2, SERHL2). In a large-scale genome-wide association study (GWAS) dataset (n = 171,176), variants with selection signatures show enrichment in heterogeneity of derived allele frequency spectra among the geographic regions of Japan, highlighted by two major regional clusters (Hondo and Ryukyu). While the selection signatures do not show enrichment in archaic hominin-derived genome sequences, they overlap with SNPs associated with modern human traits. The strongest overlaps are observed for traits related to alcohol or nutrition metabolism. Our study illustrates the value of high-depth WGS for understanding evolution and its relationship with disease risk.

Recent natural selection has left signals in human genomes. Here, Okada et al. generate high-depth whole-genome sequence (WGS) data (25.9×) from 2,234 Japanese people of the BioBank Japan Project (BBJ), and identify signals of recent natural selection that overlap with variants associated with human traits.
