Pathogenicity and selective constraint on variation near splice sites

Mutations that perturb normal pre-mRNA splicing are significant contributors to human disease. We used exome sequencing data from 7,833 probands with developmental disorders (DD) and their unaffected parents, together with >60,000 aggregated exomes from the Exome Aggregation Consortium, to investigate selection around the splice site and to quantify the contribution of splicing mutations to DDs. Three lines of evidence (patterns of purifying selection, a deficit of variants in highly constrained genes in healthy subjects, and an excess of de novo mutations in patients) highlighted particular positions within and around the consensus splice site as having greater functional relevance. Using mutational burden analyses in this large cohort of proband-parent trios, we estimated in an unbiased manner the relative contributions of mutations at canonical dinucleotides (73%) and at flanking non-canonical positions (27%), and calculated the positive predictive value of pathogenicity for different classes of mutations. We identified 18 patients with likely diagnostic de novo mutations in dominant DD-associated genes at non-canonical positions in splice sites. We estimate that 35-40% of pathogenic variants in non-canonical splice site positions are missing from public databases.
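One common way to turn a mutational burden into a positive predictive value is to treat the excess of observed over expected de novo mutations as the pathogenic fraction. The sketch below illustrates that arithmetic only. It is not taken from the paper: the function name `excess_and_ppv` and the counts are hypothetical, and the study's actual estimator may differ.

```python
def excess_and_ppv(observed: int, expected: float) -> tuple[float, float]:
    """Return (excess, ppv) for one class of de novo mutations.

    Assumption (not from the source): PPV is approximated as
    (observed - expected) / observed, where `expected` is the count
    predicted by a null mutation-rate model.
    """
    if observed <= 0:
        raise ValueError("observed must be a positive count")
    # Clamp at zero so a depleted class does not give a negative PPV.
    excess = max(observed - expected, 0.0)
    return excess, excess / observed


# Hypothetical counts, chosen only to illustrate the calculation.
excess, ppv = excess_and_ppv(observed=40, expected=10.0)
print(f"excess = {excess:.1f}, PPV = {ppv:.2f}")
```

Under this approximation, a class of sites with no excess over the null expectation gets a PPV of zero, however many variants are observed there.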
