SM2PH‐db: an interactive system for the integrated analysis of phenotypic consequences of missense mutations in proteins involved in human genetic diseases

Understanding how genetic alterations affect gene products at the molecular level is a first step in elucidating the complex relationships between genotypic and phenotypic variations, and is thus a major challenge in the postgenomic era. Here, we present SM2PH‐db (http://decrypthon.igbmc.fr/sm2ph), a new database designed to investigate the structural and functional impacts of missense mutations and their phenotypic effects in the context of human genetic diseases. A wealth of up‐to‐date, interconnected information is provided for each of the 2,249 disease‐related entry proteins (as of August 2009). This includes data retrieved from biological databases and data generated by a Sequence–Structure–Evolution Inference in Systems‐based approach, such as multiple alignments, three‐dimensional structural models, and multidimensional (physicochemical, functional, structural, and evolutionary) characterizations of mutations. SM2PH‐db provides a robust infrastructure coupled with interactive analysis tools that support in‐depth study and interpretation of the molecular consequences of mutations, with the longer‐term goal of elucidating the chain of events leading from a molecular defect to its pathology. The entire content of SM2PH‐db is regularly and automatically updated using the computational grid and data federation facilities provided by the Decrypthon program. Hum Mutat 31:127–135, 2010. © 2009 Wiley‐Liss, Inc.
