A hidden layer of structural variation in transposable elements reveals potential genetic modifiers in human disease-risk loci

Genome-wide association studies (GWAS) have been highly informative in discovering disease-associated loci but are not designed to capture all structural variations in the human genome. Using long-read sequencing data, we discovered widespread structural variation within SINE-VNTR-Alu (SVA) elements, a class of great ape-specific transposable elements with gene-regulatory roles; this variation represents a major source of structural variability in the human population. We highlight the presence of structurally variable SVAs (SV-SVAs) in neurological disease–associated loci, and we further associate SV-SVAs with disease-associated SNPs and differential gene expression using luciferase assays and expression quantitative trait loci data. Finally, we genetically deleted SV-SVAs in the BIN1 and CD2AP Alzheimer's disease–associated risk loci and in the BCKDK Parkinson's disease–associated risk locus and assessed multiple aspects of their gene-regulatory influence in a human neuronal context. Together, this study reveals a novel layer of genetic variation in transposable elements that may help identify the structural variants that are the actual drivers of disease associations at GWAS loci.
