Populational landscape of INDELs affecting transcription factor-binding sites in humans

Background

Differences in gene expression play a significant role in the diversity of human phenotypes. Here we integrated public human data from ENCODE, 1000 Genomes and Geuvadis to explore the populational landscape of INDELs affecting transcription factor-binding sites (TFBS). A significant fraction of TFBS close to the transcription start site of known genes is affected by INDELs, with a consequent effect on the expression of the associated gene.

Results

Hundreds of TFBS-affecting INDELs (TFBS-ID) show differential frequencies between human populations, suggesting a role for natural selection in the spread of these variants. A comparison with a dataset of known human genomic regions under natural selection allowed us to identify several cases of TFBS-ID likely involved in populational adaptations. Ontology analyses of the differential TFBS-ID further indicated several biological processes under natural selection in different populations.

Conclusion

Together, our results strongly suggest that INDELs have an important role in modulating gene expression patterns in humans. The dataset we make available, together with other data reporting variability at both regulatory and coding regions of genes, represents a powerful tool for studies aiming to better understand the evolution of gene regulatory networks in humans.
