Algorithms for incorporating prior topological information in HMMs: application to transmembrane proteins

Background: Hidden Markov Models (HMMs) have been used extensively in computational molecular biology for modelling protein and nucleic acid sequences. In many applications, such as transmembrane protein topology prediction, incorporating a limited amount of information about the topology, obtained from biochemical experiments, has proven to be a very useful strategy that markedly increases the performance of even the top-scoring methods. However, no clear and formal explanation of algorithms that retain the probabilistic interpretation of the models has been presented so far in the literature.

Results: We present a simple method that allows the incorporation of prior topological information about the sequences at hand, while the HMMs retain their full probabilistic interpretation in terms of conditional probabilities. We present modifications to the standard Forward and Backward algorithms of HMMs, and we show explicitly how reliable predictions may be obtained from these modifications using all the algorithms currently available for decoding HMMs. A similar procedure may be used during training, with the aim of optimizing the labels of the HMM's classes, especially in cases such as transmembrane proteins, where the labels of the membrane-spanning segments are inherently misplaced. We apply this approach to develop a method for predicting the transmembrane regions of alpha-helical membrane proteins, trained on crystallographically solved data. We show that this method compares well against established algorithms presented in the literature and that it is very useful in practical applications.

Conclusion: The algorithms presented here are easily implemented in any kind of Hidden Markov Model. The prediction method (HMM-TM) is freely available for academic users at http://bioinformatics.biol.uoa.gr/HMM-TM, offering the most advanced decoding options currently available.
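As an illustration of how prior labels can enter the Forward algorithm while keeping a conditional-probability interpretation, the following is a minimal sketch and not the authors' implementation. It assumes that a known label at a position simply gives zero weight to every state whose label disagrees there. The two-state toy model, the function name `forward`, and the parameters `state_label` and `known` are hypothetical and chosen only for this example.

```python
def forward(obs, states, start, trans, emit, state_label=None, known=None):
    """Return P(obs, known labels) under the HMM.

    `known` maps 0-based positions to a required label. States whose
    label disagrees with the requirement at that position get zero weight.
    With no `known` entries this is the standard Forward algorithm.
    """
    known = known or {}

    def allowed(pos, s):
        # Indicator: True if state s is compatible with the prior at pos.
        return pos not in known or state_label[s] == known[pos]

    # Initialisation at the first position.
    f = {s: start[s] * emit[s][obs[0]] if allowed(0, s) else 0.0
         for s in states}
    # Recursion over the remaining positions.
    for t in range(1, len(obs)):
        f = {
            s: (sum(f[r] * trans[r][s] for r in states) * emit[s][obs[t]]
                if allowed(t, s) else 0.0)
            for s in states
        }
    return sum(f.values())


# Hypothetical toy model: an "in" state (label i) and a "mem" state (label M)
# emitting hydrophobic (h) or polar (p) residues.
states = ["in", "mem"]
state_label = {"in": "i", "mem": "M"}
start = {"in": 0.5, "mem": 0.5}
trans = {"in": {"in": 0.5, "mem": 0.5}, "mem": {"in": 0.5, "mem": 0.5}}
emit = {"in": {"h": 0.2, "p": 0.8}, "mem": {"h": 0.8, "p": 0.2}}

obs = "hph"
p_x = forward(obs, states, start, trans, emit)
p_x_and_M = forward(obs, states, start, trans, emit, state_label, {1: "M"})
# Conditional probability that position 1 is labelled M, given the sequence.
p_M_given_x = p_x_and_M / p_x
```

Because the constrained run returns a joint probability, dividing it by the unconstrained likelihood gives the probability of the prior information given the sequence. This is one way of reading "retaining the probabilistic interpretation" in the sketch above.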
