Topology prediction for helical transmembrane proteins at 86% accuracy

Previously, we introduced a neural network system that predicts the locations of transmembrane helices (HTMs) from evolutionary profiles (PHDhtm; Rost B, Casadio R, Fariselli P, Sander C, 1995, Protein Sci 4:521–533). Here, we describe an improvement and an extension of that system. The improvement is achieved by a dynamic programming-like algorithm that optimizes helices compatible with the neural network output. The extension is the prediction of topology (orientation of the first loop region with respect to the membrane), obtained by applying to the refined prediction the observation that positively charged residues are more abundant in intra-cytoplasmic regions. Furthermore, we introduce a method to reduce the number of false positives, i.e., proteins falsely predicted to contain membrane helices. The evaluation of prediction accuracy is based on a cross-validation set and a double-blind test set (131 proteins in total). The final method appears to be more accurate than other published methods: (1) For almost 89% (±3%) of the test proteins, all HTMs are predicted correctly. (2) For more than 86% (±3%) of the proteins, topology is predicted correctly. (3) We define reliability indices that correlate with prediction accuracy: for one half of the proteins, segment accuracy rises to 98%; and for two-thirds, accuracy of topology prediction is 95%. (4) The rate of proteins for which HTMs are predicted falsely is below 2% (±1%). Finally, the method is applied to 1,616 sequences of Haemophilus influenzae. We predict 19% of the genome sequences to contain one or more HTMs. This appears to be lower than what we predicted previously for yeast chromosome VIII (about 25%).
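The charge-based topology step can be illustrated with a minimal sketch. It is not the published implementation: the function names, the choice of lysine and arginine as the only positive residues, and the plain count difference without any loop-length cutoff are all assumptions made for illustration. Given a sequence and already predicted helix positions, it compares the positive charge in the loops on the side of the first loop with that in the loops on the opposite side, and assigns the more positively charged side to the cytoplasm.

```python
from typing import List, Tuple

# Assumption of this sketch: only lysine (K) and arginine (R) count as positive.
POSITIVE = set("KR")


def loop_regions(length: int, helices: List[Tuple[int, int]]) -> List[Tuple[int, int]]:
    """Return the non-helix stretches as half-open (start, end) intervals.

    Empty loops are kept so that loop index parity stays aligned with the
    number of membrane crossings.
    """
    loops = []
    prev_end = 0
    for start, end in sorted(helices):
        loops.append((prev_end, start))
        prev_end = end
    loops.append((prev_end, length))
    return loops


def predict_topology(sequence: str, helices: List[Tuple[int, int]]) -> Tuple[str, int]:
    """Predict the orientation of the first loop from positive-charge counts.

    Loops with even index lie on the same side of the membrane as the first
    loop; loops with odd index lie on the opposite side. Returns the predicted
    side of the first loop ("in", "out", or "undecided") and the charge
    difference (same side minus opposite side).
    """
    loops = loop_regions(len(sequence), helices)
    charges = [sum(1 for aa in sequence[s:e] if aa in POSITIVE) for s, e in loops]
    same_side = sum(charges[0::2])
    opposite_side = sum(charges[1::2])
    diff = same_side - opposite_side
    if diff > 0:
        orientation = "in"
    elif diff < 0:
        orientation = "out"
    else:
        orientation = "undecided"
    return orientation, diff


# Hypothetical two-helix protein with positively charged terminal loops.
two_helix = "MKRK" + "L" * 20 + "GSG" + "L" * 20 + "RKKD"
print(predict_topology(two_helix, [(4, 24), (27, 47)]))
```

In this sketch the absolute charge difference could serve as a crude confidence value, in the spirit of the reliability indices mentioned above, although the abstract does not specify how those indices are defined.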
