Prediction of lipoprotein signal peptides in Gram-positive bacteria with a Hidden Markov Model.

We present PRED-LIPO, a Hidden Markov Model method for the prediction of lipoprotein signal peptides in Gram-positive bacteria, trained on a set of 67 experimentally verified lipoproteins. The method outperforms LipoP and the methods based on regular expression patterns on several data sets containing experimentally characterized lipoproteins, secretory proteins, proteins with an N-terminal transmembrane (TM) segment, and cytoplasmic proteins. The method is also highly sensitive and specific in the detection of secretory signal peptides and, in terms of overall accuracy, outperforms even SignalP, the top-scoring method for the prediction of signal peptides. PRED-LIPO is freely available at http://bioinformatics.biol.uoa.gr/PRED-LIPO/, and we anticipate that it will be a valuable tool for experimentalists studying secreted proteins and lipoproteins from Gram-positive bacteria.
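
For readers unfamiliar with the regular-expression approach used as a baseline above, the following is a minimal Python sketch of how such a pattern scan might look. The pattern is a simplified, illustrative lipobox consensus chosen for this example; it is an assumption and is not one of the patterns evaluated in this work, nor the HMM itself. The function name `find_lipobox_cysteine` and the test sequences are hypothetical.

```python
import re

# Simplified, illustrative pattern (an assumption, NOT a published pattern):
#   - initiator Met, then up to 13 arbitrary residues
#   - a positively charged residue (R or K) closing the n-region
#   - 6 to 20 residues with no charged residue (D, E, R, K): the h-region
#   - a lipobox-like motif [LVI][ASTVI][GAS] followed by the lipidated Cys
LIPOBOX = re.compile(r"^M.{0,13}[RK][^DERK]{6,20}[LVI][ASTVI][GAS]C")


def find_lipobox_cysteine(seq):
    """Return the 1-based position of the putative lipidated cysteine,
    or None if the simplified pattern does not match."""
    match = LIPOBOX.match(seq.upper())
    # match.end() is the 0-based exclusive end of the match, which equals
    # the 1-based position of the final Cys in the pattern.
    return match.end() if match else None


if __name__ == "__main__":
    # Hypothetical sequences, made up for illustration only.
    print(find_lipobox_cysteine("MKKLLIAALSLAVLLTACSSNDK"))
    print(find_lipobox_cysteine("MKKLLIAALSDAVLLTACSS"))
```

A fixed pattern of this kind gives a yes/no answer at a single position, whereas a probabilistic model can weigh weak evidence across the whole signal peptide. That difference is one plausible reason for preferring an HMM, although the sketch makes no claim about how the comparison in this work was carried out.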
