Database tool MSV3d: database of human MisSense variants mapped to 3D protein structure

The elucidation of the complex relationships linking genotypic and phenotypic variations to protein structure is a major challenge in the post-genomic era. We present MSV3d (Database of human MisSense Variants mapped to 3D protein structure), a new database that contains detailed annotation of missense variants for all human proteins (20 199 proteins). The multi-level characterization covers the physico-chemical changes induced by the amino acid substitution, the conservation of the mutated residue, and its position relative to functional features in the available or predicted 3D model. Major releases of the database are generated automatically and updated regularly in line with dbSNP (the database of Single Nucleotide Polymorphisms) and SwissVar releases, exploiting the extensive Decrypthon computational grid resources. The database (http://decrypthon.igbmc.fr/msv3d) is accessible through a simple web interface coupled to a powerful query engine and a standard web service. The content can be downloaded in whole or in part, in XML or flat file formats.
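To make the idea of a physico-chemical characterization concrete, the following is a minimal sketch of how the change induced by a single amino acid substitution might be summarized. It is not the MSV3d pipeline: the choice of the Kyte-Doolittle hydropathy scale, the simplified formal charges at neutral pH (histidine treated as neutral), the short `R175H`-style variant notation, and the names `SubstitutionProfile` and `profile_substitution` are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass

# Kyte-Doolittle hydropathy index (assumed scale; MSV3d may use other descriptors).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# Simplified formal side-chain charge at neutral pH (assumption: His is neutral).
CHARGE = {"D": -1, "E": -1, "K": 1, "R": 1}

# Hypothetical short notation: wild-type residue, position, mutant residue.
_VARIANT = re.compile(r"^([A-Z])(\d+)([A-Z])$")


@dataclass(frozen=True)
class SubstitutionProfile:
    wild_type: str
    position: int
    mutant: str
    hydropathy_delta: float  # mutant minus wild type
    charge_delta: int        # mutant minus wild type


def profile_substitution(variant: str) -> SubstitutionProfile:
    """Summarize the hydropathy and charge change of a missense variant."""
    match = _VARIANT.match(variant)
    if match is None:
        raise ValueError(f"unrecognized variant notation: {variant!r}")
    wild_type, position, mutant = match.groups()
    for residue in (wild_type, mutant):
        if residue not in KYTE_DOOLITTLE:
            raise ValueError(f"not a standard amino acid: {residue!r}")
    if wild_type == mutant:
        raise ValueError("not a missense change: residue is unchanged")
    return SubstitutionProfile(
        wild_type=wild_type,
        position=int(position),
        mutant=mutant,
        hydropathy_delta=KYTE_DOOLITTLE[mutant] - KYTE_DOOLITTLE[wild_type],
        charge_delta=CHARGE.get(mutant, 0) - CHARGE.get(wild_type, 0),
    )


if __name__ == "__main__":
    print(profile_substitution("R175H"))
```

A per-residue profile of this kind only covers the sequence-level part of the annotation; conservation and 3D context would require a multiple alignment and a structure or model, which this sketch does not attempt.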
