Contribution to the Prediction of the Fold Code: Application to Immunoglobulin and Flavodoxin Cases

Background: Formation of the folding nucleus of a globular protein starts with the mutual interaction of a group of hydrophobic amino acids whose close contacts allow the subsequent formation and stabilisation of the 3D structure. These early steps can be predicted by simulating the folding process with a coarse-grained Monte Carlo (MC) model in a discrete space. We previously defined MIRs (Most Interacting Residues) as the set of residues presenting a large number of non-covalent neighbour interactions during such a simulation. MIRs are good candidates for the minimal set of residues that gives rise to a given fold rather than another, although their proportion is rather high, typically 15-20% of the sequence. Motivated by experiments on pairs of sequences that share very high sequence identity (up to 90%) yet adopt different folds, we combined the MIR method, which takes the sequence as its only input, with the “fuzzy oil drop” (FOD) model, which requires a 3D structure, in order to estimate the residues coding for the fold. FOD assumes that a globular protein follows an idealised 3D Gaussian distribution of hydrophobicity density, with the maximum at the centre and minima at the surface of the “drop”. If the actual local hydrophobicity density around a given amino acid is as high as the ideal one, this amino acid is assigned to the core of the globular protein and is considered to follow the FOD model. One thus obtains a classification of the amino acids of a protein according to their agreement or disagreement with the FOD model.

Results: We compared and combined the MIR and FOD methods to define the minimal nucleus, or keystone, of two highly populated folds: immunoglobulin-like (Ig) and flavodoxin (Flav). Combining the two approaches identifies positions that are both predicted as MIRs and assigned as accordant with the FOD model. For these two folds, the intersection of the two predicted sets of residues differs significantly from a random selection. It contains fewer residues than the set selected by either method alone and is in reasonable agreement with the experimentally determined key residues coding for the particular fold. The intersection therefore gives a smaller, more specific set of candidate residues for the folding nucleus.
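
The comparison of the MIR/FOD intersection with a random selection can be illustrated with a short sketch. It is not the authors' implementation: it assumes that "random selection" is modelled by a hypergeometric distribution (drawing the FOD-accordant positions at random from the sequence and counting how many fall on MIRs). The function name `overlap_p_value`, the residue numbers and the sequence length of 100 are all hypothetical.

```python
from math import comb


def overlap_p_value(n_residues, n_mir, n_fod, n_both):
    """Upper-tail hypergeometric probability P(X >= n_both).

    X is the number of MIR positions hit when n_fod positions are drawn
    at random, without replacement, from a sequence of n_residues
    positions of which n_mir are MIRs. (Assumed null model.)
    """
    total = comb(n_residues, n_fod)
    upper = min(n_mir, n_fod)
    tail = sum(
        comb(n_mir, k) * comb(n_residues - n_mir, n_fod - k)
        for k in range(n_both, upper + 1)
    )
    return tail / total


# Hypothetical residue numbers, for illustration only.
mir_positions = {3, 7, 12, 18, 25, 31}        # predicted MIRs
fod_positions = {7, 12, 18, 40, 44, 52, 60}   # accordant with FOD
keystone = mir_positions & fod_positions      # candidate nucleus positions

p_value = overlap_p_value(
    n_residues=100,
    n_mir=len(mir_positions),
    n_fod=len(fod_positions),
    n_both=len(keystone),
)
print(sorted(keystone), p_value)
```

A small `p_value` would indicate that the two methods agree on more positions than expected by chance under this assumed null model; other null models (for example, restricting the draw to hydrophobic positions) would give different values.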
