MoBiDiC Prioritization Algorithm, a Free, Accessible, and Efficient Pipeline for Single-Nucleotide Variant Annotation and Prioritization for Next-Generation Sequencing Routine Molecular Diagnosis.

Interpretation of next-generation sequencing data constitutes the main limitation of molecular diagnostics. In diagnosing myopathies and muscular dystrophies, a further issue is how efficiently the pathogenicity of variants identified in large genes, especially TTN, can be predicted; current in silico prediction tools show limitations in predicting and ranking the numerous variants of such genes. We propose a variant-prioritization tool, the MoBiDiC prioritization algorithm (MPA). MPA is based on curated interpretation of data on previously reported variants, biological assumptions, and splice and missense predictors, and is used to prioritize all types of single-nucleotide variants. MPA was validated by comparing its sensitivity and specificity with those of the prediction tools in the dbNSFP database, using a data set composed of DYSF, DMD, LMNA, NEB, and TTN variants extracted from expert-reviewed databases and from ExAC. MPA obtained the best annotation rates for missense and splice variants. Because MPA aggregates the results of several predictors, the errors of individual predictors are counterbalanced, which improves the sensitivity and specificity of missense and splice variant predictions. We propose a sequential use of MPA: begin by selecting the variants with higher scores and, in the absence of candidate pathogenic variants, go on to consider variants with lower scores. We provide scripts and documentation for free academic use, together with a validated annotation pipeline scaled for panel and exome sequencing that prioritizes single-nucleotide variants from a VCF file.
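
The aggregation and sequential-review ideas described above can be illustrated with a minimal Python sketch. It does not reproduce MPA's actual scoring rules, which the abstract does not specify. The names `Variant`, `aggregate_score`, and `review_tiers`, the 0 to 10 scale, and the tier thresholds are all assumptions made for illustration only.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterator, List, Optional, Tuple


@dataclass
class Variant:
    """Hypothetical container for one single-nucleotide variant."""
    name: str
    # Predictor name -> True (deleterious), False (benign), None (no prediction).
    calls: Dict[str, Optional[bool]] = field(default_factory=dict)


def aggregate_score(variant: Variant, scale: float = 10.0) -> Optional[float]:
    """Fraction of available predictors calling the variant deleterious,
    rescaled to [0, scale]. Returns None if no predictor gave a call.
    (Illustrative majority-style aggregation, not the published MPA score.)"""
    votes = [call for call in variant.calls.values() if call is not None]
    if not votes:
        return None
    return scale * sum(votes) / len(votes)


def review_tiers(
    variants: List[Variant],
    thresholds: Tuple[float, ...] = (8.0, 5.0, 0.0),  # assumed cut-offs
) -> Iterator[Tuple[float, List[Variant]]]:
    """Yield disjoint tiers of variants from highest to lowest score.
    The caller stops iterating once a candidate variant has been found."""
    scored = [(v, aggregate_score(v)) for v in variants]
    upper = float("inf")
    for lower in thresholds:
        tier = [(v, s) for v, s in scored if s is not None and lower <= s < upper]
        tier.sort(key=lambda pair: pair[1], reverse=True)
        yield lower, [v for v, _ in tier]
        upper = lower


if __name__ == "__main__":
    demo = [
        Variant("var_A", {"p1": True, "p2": True, "p3": True}),
        Variant("var_B", {"p1": True, "p2": False, "p3": None}),
        Variant("var_C", {"p1": False, "p2": False}),
        Variant("var_D", {}),  # no predictor output: left unscored
    ]
    for threshold, tier in review_tiers(demo):
        print(threshold, [v.name for v in tier])
```

Because `review_tiers` is a generator, lower-scoring tiers are only computed when the reviewer asks for them, which mirrors the sequential use proposed in the abstract. Variants without any predictor output are left unscored rather than assigned a default value.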
