The Swiss‐Prot variant page and the ModSNP database: A resource for sequence and structure information on human protein variants

Missense mutations that give rise to single amino acid polymorphisms (SAPs) are the type of mutation most frequently related to human diseases. The Swiss‐Prot protein knowledgebase records information on such mutations in various sections of a protein entry, namely the “feature,” “comment,” and “reference” fields. To help users obtain the most relevant information about each human SAP recorded in the knowledgebase, the Swiss‐Prot variant web pages were created to provide a summary of the available sequence information, as well as additional structural information, on each variant. In particular, the ModSNP database was set up to store information related to SAPs and to manage the modeling of SAPs onto protein structures via an automatic homology modeling pipeline. Currently, among the 16,566 human SAPs recorded in the Swiss‐Prot knowledgebase (release 42.5, 21 November 2003), more than 25% have corresponding 3D‐models. Of these variants, 47% are related to disease, 26% are polymorphisms, and 27% are not yet clearly classified. With each weekly Swiss‐Prot release, the ModSNP database is updated and the model construction pipeline is then launched. Thus, the ModSNP database represents a valuable resource for the structural analysis of protein variation. The Swiss‐Prot variant pages are accessible from the NiceProt view of a Swiss‐Prot entry on the ExPASy server (www.expasy.org/), via a hyperlink attached to the stable and unique identifier (FTId) of each human SAP. Hum Mutat 23:464–470, 2004. © 2004 Wiley‐Liss, Inc.
