On the derivation of propensity scales for predicting exposed transmembrane residues of helical membrane proteins

Helical membrane proteins (HMPs) play a crucial role in diverse physiological processes. Because their structures are difficult to determine experimentally, computational methods for predicting the burial status of transmembrane residues are desirable. Central to developing such methods is deriving, from known structures, a propensity scale that expresses how likely each of the 20 amino acids is to be exposed to the lipid bilayer. A fundamental question in this regard is how to derive such a scale optimally. Here, we show that this problem can be reformulated so that an optimal scale is obtained straightforwardly in an analytical fashion. The derived scale compares favorably with others in terms of both algorithmic optimality and practical prediction accuracy. It also provides insights into the structural organization of HMPs. Furthermore, the presented approach can be applied to other bioinformatics problems involving HMPs. All the data sets and programs used in the study, as well as detailed primary results, are available upon request.
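To make the notion of a propensity scale concrete, the following is a minimal sketch of a conventional frequency-ratio (log-odds) scale computed from residues labeled as exposed or buried. It is not the analytically optimal scale derived in this work, whose formulation is not given in the abstract; the function name `log_odds_propensity`, the input format, and the pseudocount are assumptions made for illustration only.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def log_odds_propensity(residues, pseudocount=1.0):
    """Return a log-odds exposure propensity for each of the 20 amino acids.

    residues: iterable of (amino_acid, is_exposed) pairs, e.g. taken from
    transmembrane residues of known structures (hypothetical input format).
    A positive value means the residue type is over-represented among
    lipid-exposed positions relative to all transmembrane positions.
    """
    exposed = Counter()
    total = Counter()
    for aa, is_exposed in residues:
        if aa not in AMINO_ACIDS:
            continue  # skip non-standard residue codes
        total[aa] += 1
        if is_exposed:
            exposed[aa] += 1

    n_total = sum(total.values())
    n_exposed = sum(exposed.values())
    k = len(AMINO_ACIDS)

    scale = {}
    for aa in AMINO_ACIDS:
        # Pseudocounts (an assumption) keep the ratio finite for rare residues.
        p_exposed = (exposed[aa] + pseudocount) / (n_exposed + k * pseudocount)
        p_all = (total[aa] + pseudocount) / (n_total + k * pseudocount)
        scale[aa] = math.log(p_exposed / p_all)
    return scale

# Toy data, invented purely for illustration: Leu mostly exposed, Ser mostly buried.
toy = [("L", True)] * 3 + [("L", False)] + [("S", True)] + [("S", False)] * 3
toy_scale = log_odds_propensity(toy)
```

With such a scale, a transmembrane residue could be scored by looking up its amino acid type, and in this toy data set leucine receives a higher exposure propensity than serine. A real scale would require a non-redundant set of HMP structures and a defined exposure criterion, neither of which is specified here.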
