SPdb – a signal peptide database

Background

The signal peptide plays an important role in protein targeting and protein translocation in both prokaryotic and eukaryotic cells. This short, transient peptide sequence functions like a postal address on an envelope: it targets proteins for secretion or for transfer to specific organelles for further processing. Understanding how signal peptides function is crucial for predicting where proteins are translocated. To support this understanding, we present SPdb, a signal peptide database (http://proline.bic.nus.edu.sg/spdb) that serves as a repository of experimentally determined and computationally predicted signal peptides.

Results

SPdb integrates information from two sources: (a) the Swiss-Prot protein sequence database, which is now part of UniProt, and (b) the EMBL nucleotide sequence database. Database updates are semi-automated, with human checking and verification to ensure that the stored data are correct. The latest release, SPdb 3.2, contains 18,146 entries, of which 2,584 are experimentally verified signal sequences; the remaining 15,562 entries are either signal sequences that fail to meet our filtering criteria or entries that contain unverified signal sequences.

Conclusion

SPdb is a manually curated database constructed to support the understanding and analysis of signal peptides. SPdb tracks the major updates of the two underlying primary databases, thereby ensuring that its information remains up to date.
