Integrating multiple scoring functions to improve protein loop structure conformation space sampling

In this article, we present a new protein structure modeling approach based on multi-scoring-function sampling. The rationale is to integrate multiple carefully selected physics- or knowledge-based scoring functions so that the insensitivity and inaccuracy of any individual scoring function can be tolerated, thereby improving protein structure modeling accuracy. We apply the multi-scoring-function sampling approach to protein loop backbone structure modeling. Our computational results show that sampling the scoring function space of a physics-based soft-sphere potential function and a knowledge-based scoring function based on pairwise atom distances improves the resolution of the predicted decoy populations for a set of 12-residue benchmark loop targets.
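The abstract does not say how the scoring functions are combined during sampling. As a minimal sketch of one possible reading, assuming a Pareto-dominance criterion (an assumption, not the authors' stated method), the following Python keeps every decoy that no other decoy beats on all scores at once. The names `dominates`, `pareto_front`, and the score values are hypothetical and purely illustrative; lower scores are taken to be better.

```python
from typing import List, Sequence, Tuple


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if score vector a is no worse than b on every
    scoring function and strictly better on at least one (lower is better)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(scores: List[Tuple[float, ...]]) -> List[int]:
    """Return the indices of decoys that are not dominated by any other decoy."""
    return [
        i
        for i, s in enumerate(scores)
        if not any(dominates(t, s) for j, t in enumerate(scores) if j != i)
    ]


# Hypothetical (soft-sphere, pairwise-distance) scores for five loop decoys.
# These numbers are made up for illustration only.
decoy_scores = [(1.0, 5.0), (2.0, 3.0), (3.0, 4.0), (4.0, 1.0), (2.0, 3.0)]

front = pareto_front(decoy_scores)
print(front)
```

Under this assumption, a decoy that one scoring function ranks poorly can still survive if the other ranks it well, which is one way a sampler might tolerate the inaccuracy of a single function.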
