Congruency in the prediction of pathogenic missense mutations: state-of-the-art web-based tools

Deep sequencing of samples from thousands of subjects across several populations has revealed a remarkable degree of genetic variation in the protein-encoding regions of DNA. Approximately half of the 20 000 single nucleotide polymorphisms present, even in normal healthy subjects, are nonsynonymous amino acid substitutions that could potentially affect protein function. The greatest challenges currently facing investigators are data interpretation and the development of strategies to identify the few gene-coding variants that actually cause or confer susceptibility to disease. A confusing array of computational tools is available to address this problem. Unfortunately, the overall accuracy of these tools at ultraconserved positions is low, and their predictions may mislead researchers involved in downstream experimental and clinical studies. Here, we first present an updated review of these tools and their primary functionalities, focusing on those naturally suited to analyzing massive variant sets, and infer some notable similarities among their results. We then evaluate the prediction congruency of some of these web-based tools on real whole-exome sequencing data in a proof-of-concept study.
