Protein Sequence Variants: Resources and Tools

Originally published in: Biomedical Applications of Proteomics. Edited by Jean-Charles Sanchez, Garry L. Corthals and Denis F. Hochstrasser. Copyright © 2004 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim. Print ISBN: 3-527-30807-1.

The sections in this article are: Introduction Medical Protein Annotation Databases Central Databases Online Mendelian Inheritance in Man (OMIM) The Human Gene Mutation Database (HGMD) The SNP Databases Advantages and Drawbacks of Central Databases Specialized Databases An Example of a Locus-specific Database: the IARC TP53 Database An Example of a Disease-oriented Specialized Database: Retina International's Scientific Newsletter – Mutation Database Other Locus-specific Databases Advantages and Drawbacks of Specialized Databases The Swiss-Prot Protein Knowledgebase and Information on Disease and Sequence Variations Gene Names Description of Diseases Proteins as Therapeutic Drugs Data on Variants Cross-references Medical-oriented Keywords Techniques of Search Challenges for Databases Analysis Tools in the Context of Protein Variants Proteomic Tools for Protein Identification and the Characterization of Variants Protein Identification Tools Peptide Characterization Tools Tools for Analyzing and/or Predicting the Effects of Protein Variants Sequence-based Analysis or Prediction Tools Structure-based Analysis or Prediction Tools The Swiss-Prot Variant Page and Comparative Modeling Remarks Conclusions

Keywords: protein variations; predicting the effects; resources; databases; techniques of search; analysis tools
