Evaluation of transmembrane helix predictions in 2014

Experimental structure determination continues to be challenging for membrane proteins. Computational prediction methods are therefore needed and widely used to supplement experimental data. Here, we re‐examined the state of the art in transmembrane helix prediction based on a nonredundant dataset of 190 high‐resolution structures. Analyzing 12 widely used, well‐known methods with a stringent performance measure, we largely confirmed the expected high level of performance. However, all methods performed worse for proteins that could not have been used in their development. A few results stood out: First, all methods predicted proteins in eukaryotes better than those in bacteria. Second, methods worked less well for proteins with many transmembrane helices. Third, most methods correctly discriminated between soluble and transmembrane proteins, although several older methods often mistook signal peptides for transmembrane helices; some newer methods have overcome this shortcoming. In our hands, PolyPhobius and MEMSAT‐SVM outperformed the other methods. Proteins 2015; 83:473–484. © 2014 Wiley Periodicals, Inc.
