Integrative approach for detecting membrane proteins

Background: Membrane proteins are key gatekeepers that control various vital cellular functions. They are often detected using transmembrane topology prediction tools. While these tools can detect integral membrane proteins, they do not address surface-bound proteins. In this study, we focused on finding the best techniques for distinguishing all types of membrane proteins.

Results: This research first demonstrates the shortcomings of relying solely on transmembrane topology prediction tools to detect all types of membrane proteins. It then explores the performance of various feature extraction techniques in combination with different machine learning algorithms. The experimental results obtained by cross-validation and independent testing suggest that an integrative approach, which combines the results of transmembrane topology prediction with those of a pseudo position-specific scoring matrix (Pse-PSSM) optimized evidence-theoretic k-nearest neighbor (OET-KNN) predictor, yields the best performance.

Conclusion: The integrative approach outperforms the state-of-the-art methods in terms of accuracy and Matthews correlation coefficient (MCC): its accuracy reached 92.51% in independent testing, compared to the 89.53% and 79.42% accuracies achieved by the state-of-the-art methods.
