A sequence-profile-based HMM for predicting and discriminating beta barrel membrane proteins

MOTIVATION Membrane proteins are an abundant and functionally relevant subset of proteins, estimated to make up about 15 to 30% of the proteome of fully sequenced organisms. These estimates are mainly computed on the basis of sequence comparison and membrane protein prediction. It is therefore urgent to develop methods capable of selecting membrane proteins, especially outer membrane proteins, which are barely taken into consideration when proteome-wide analysis is performed. Such methods will also help protein annotation when no homologous sequence is found in the database. The outer membrane proteins solved so far at atomic resolution interact with the external membrane of bacteria through a characteristic beta barrel structure comprising different even numbers of beta strands (beta barrel membrane proteins). In this they differ from the proteins of the cytoplasmic membrane, which are endowed with alpha helix bundles (all-alpha membrane proteins), and they need specialised predictors.

RESULTS We develop an HMM that predicts the topology of beta barrel membrane proteins using evolutionary information as input. The model is cyclic with 6 types of states: two for the beta strand transmembrane core, one for the beta strand cap on either side of the membrane, one for the inner loop, one for the outer loop and one for the globular domain state in the middle of each loop. The development of a specific input for HMM based on multiple sequence alignment is novel. The per-residue accuracy of the model is 83% when a jackknife procedure is adopted. With a model optimisation method using a dynamic programming algorithm, the topological models of seven of the twelve proteins in the testing set are also correctly predicted. When used as a discriminator, the model is rather selective. At a fixed probability value, it retains 84% of a non-redundant set comprising 145 sequences of well-annotated outer membrane proteins. Concomitantly, it correctly rejects 90% of a set of globular proteins including about 1200 chains with low sequence identity (<30%) and 90% of a set of all-alpha membrane proteins including 188 chains.
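To make the idea of a cyclic HMM decoded by dynamic programming over profile input more concrete, the following is a minimal sketch and not the model described above. It assumes a reduced cycle of four states (inner loop, outward strand, outer loop, inward strand) instead of the six state types, a toy two-letter alphabet (hydrophobic, polar) instead of the twenty amino acids, and a dot product between a profile column and a state's emission vector as the emission score. All state names, probabilities and the scoring rule are illustrative assumptions.

```python
import math

# Hypothetical reduced cycle; the abstract's model has six state types.
STATES = ["inner_loop", "strand_out", "outer_loop", "strand_in"]

STAY = 0.7  # assumed self-transition probability, chosen for illustration


def transition(i, j):
    """Cyclic topology: a state either repeats or advances to the next one."""
    if i == j:
        return STAY
    if j == (i + 1) % len(STATES):
        return 1.0 - STAY
    return 0.0


# Assumed emission vectors over a toy alphabet [hydrophobic, polar].
EMIT = {
    "inner_loop": [0.2, 0.8],
    "strand_out": [0.8, 0.2],
    "outer_loop": [0.2, 0.8],
    "strand_in": [0.8, 0.2],
}


def emission(state, column):
    """Score a profile column (residue frequencies) against a state (assumption)."""
    return sum(f * e for f, e in zip(column, EMIT[state]))


def log(x):
    return math.log(x) if x > 0 else float("-inf")


def viterbi(profile):
    """Return the most probable state path for a list of profile columns."""
    n = len(STATES)
    # Assumption: every path starts in the inner loop.
    score = [
        log(1.0 if s == 0 else 0.0) + log(emission(STATES[s], profile[0]))
        for s in range(n)
    ]
    back = []
    for column in profile[1:]:
        new_score, pointers = [], []
        for j in range(n):
            best_i = max(range(n), key=lambda i: score[i] + log(transition(i, j)))
            new_score.append(
                score[best_i]
                + log(transition(best_i, j))
                + log(emission(STATES[j], column))
            )
            pointers.append(best_i)
        score = new_score
        back.append(pointers)
    # Trace back from the best final state.
    state = max(range(n), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return [STATES[s] for s in path]


if __name__ == "__main__":
    polar, hydrophobic = [0.1, 0.9], [0.9, 0.1]
    toy_profile = [polar] * 3 + [hydrophobic] * 4 + [polar] * 3 + [hydrophobic] * 4
    print(viterbi(toy_profile))
```

Because the two strand states share the same emission vector in this sketch, only the cyclic transition structure distinguishes an outward strand from an inward one, which is the property that lets a cyclic model assign loops to alternating sides of the membrane.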
