Improving fragment quality for de novo structure prediction

De novo structure prediction can be defined as a search in conformational space under the guidance of an energy function. The most successful de novo structure prediction methods, such as Rosetta, assemble fragments from known structures to reduce the search space. Fragment quality is therefore an important factor in structure prediction. In this study, we propose a method that generates a new set of fragments from the lowest-energy de novo models; these fragments are then used to predict the next round of models. In a benchmark of 30 proteins, the new fragment set performed better than the original one when used to predict de novo structures. For 22 proteins, the lowest-energy model predicted with our method was closer to the native structure than the lowest-energy model predicted with Rosetta. Following a similar trend, the best of the five lowest-energy models predicted with our method was closer to the native structure than its Rosetta counterpart for 20 proteins. In addition, when the lowest-energy model was picked as the best predicted model, the average C-alpha root mean square deviation improved from 5.99 Å with Rosetta to 5.03 Å with our method. Proteins 2014; 82:2240–2252. © 2014 Wiley Periodicals, Inc.
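
The fragment-regeneration step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the representation of a model as an (energy, backbone torsions) pair, the function names, the fragment length of 9 residues, and the number of models kept are all assumptions made for illustration only.

```python
from typing import Dict, List, Tuple

# Hypothetical representation: one (phi, psi, omega) triple per residue.
Torsions = List[Tuple[float, float, float]]
# Hypothetical representation of a de novo model: (energy, backbone torsions).
Model = Tuple[float, Torsions]


def lowest_energy_models(models: List[Model], n_keep: int) -> List[Model]:
    """Return the n_keep models with the lowest energy, best first."""
    return sorted(models, key=lambda m: m[0])[:n_keep]


def fragments_from_models(
    models: List[Model], frag_len: int = 9, n_keep: int = 2
) -> Dict[int, List[Torsions]]:
    """Cut overlapping fragments out of the lowest-energy models.

    The result maps each window start position to the list of torsion
    fragments observed at that position, one per selected model.
    frag_len and n_keep are illustrative defaults, not values from the paper.
    """
    library: Dict[int, List[Torsions]] = {}
    for _energy, torsions in lowest_energy_models(models, n_keep):
        for start in range(len(torsions) - frag_len + 1):
            library.setdefault(start, []).append(torsions[start:start + frag_len])
    return library
```

In such a scheme, the returned library would replace or supplement the original fragment set when the next round of models is assembled; how the two sets are combined is left open here.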
