Exploiting homology information in nontemplate based prediction of protein structures.

In this paper we describe a novel strategy for exploring the conformational space of proteins and show that it leads to better models for proteins whose structures are not amenable to template-based methods. Our strategy rests on the assumption that the global energy minima of homologous proteins must correspond to similar conformations, while the precise profiles of their energy landscapes, and consequently the positions of their local minima, are likely to differ. In line with this hypothesis, we apply a replica exchange Monte Carlo simulation protocol in which the parallel simulations differ not in their parameters but in their sequences: each simulation uses the sequence of a homologous protein. We show that our results are competitive with those of alternative methods, including the methods that produced the best model for each of the analyzed targets in the free modeling category of the CASP10 (10th Critical Assessment of techniques for protein Structure Prediction) experiment.
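
The idea of exchanging conformations between replicas that differ in sequence rather than in temperature can be illustrated with a toy model. The sketch below is not the protocol used in the paper: the one-dimensional coordinate, the `make_energy` landscapes, the swap criterion (the standard Metropolis rule for replicas with different energy functions at a common temperature) and all parameter values are assumptions chosen only for illustration. Each hypothetical "homolog" has its global minimum at x = 0 but its local minima in different places, so a conformation trapped in one landscape may escape after being swapped into another.

```python
import math
import random


def make_energy(ripple_freq, ripple_height=3.0):
    """Toy 1-D landscape (assumed form) with its global minimum at x = 0.

    The sin^2 ripple moves the local minima when ripple_freq changes,
    mimicking homologs that share a native state but not their traps.
    """
    def energy(x):
        return x * x + ripple_height * math.sin(ripple_freq * x) ** 2
    return energy


def swap_probability(e_i, e_j, x_i, x_j, beta):
    """Metropolis acceptance for exchanging conformations between two
    replicas at the same temperature with different energy functions."""
    delta = (e_i(x_j) + e_j(x_i)) - (e_i(x_i) + e_j(x_j))
    return 1.0 if delta <= 0 else math.exp(-beta * delta)


def run(energies, beta=2.0, sweeps=2000, step=0.5, swap_every=10, seed=0):
    """Return (best_x, best_energy), scored with energies[0] (the target)."""
    rng = random.Random(seed)
    xs = [rng.uniform(-4.0, 4.0) for _ in energies]
    target = energies[0]
    best_x = min(xs, key=target)
    for sweep in range(sweeps):
        # One Metropolis move per replica, each under its own landscape.
        for r, energy in enumerate(energies):
            trial = xs[r] + rng.uniform(-step, step)
            d = energy(trial) - energy(xs[r])
            if d <= 0 or rng.random() < math.exp(-beta * d):
                xs[r] = trial
            if target(xs[r]) < target(best_x):
                best_x = xs[r]
        # Periodically attempt to exchange conformations of two neighbors.
        if sweep % swap_every == 0 and len(energies) > 1:
            r = rng.randrange(len(energies) - 1)
            p = swap_probability(energies[r], energies[r + 1],
                                 xs[r], xs[r + 1], beta)
            if rng.random() < p:
                xs[r], xs[r + 1] = xs[r + 1], xs[r]
    return best_x, target(best_x)


if __name__ == "__main__":
    # Three hypothetical homologs; the first plays the role of the target.
    homolog_energies = [make_energy(k) for k in (2.0, 2.3, 2.7)]
    print(run(homolog_energies))
```

In this toy setting every sampled conformation is scored with the target's own energy function, which is one possible way to select a final model; the paper's actual selection procedure is not specified in this section.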
