Improvement of comparative model accuracy by free-energy optimization along principal components of natural structural variation.

Accurate high-resolution refinement of protein structure models is a formidable challenge because of the delicate balance of forces in the native state, the difficulty in sampling the very large number of alternative tightly packed conformations, and the inaccuracies in current force fields. Indeed, energy-based refinement of comparative models generally leads to degradation rather than improvement in model quality, and, hence, most current comparative modeling procedures omit physically based refinement. However, despite their inaccuracies, current force fields do contain information that is orthogonal to the evolutionary information on which comparative models are based, and, hence, refinement might be able to improve comparative models if the space that is sampled is restricted sufficiently so that false attractors are avoided. Here, we use the principal components of the variation of backbone structures within a homologous family to define a small number of evolutionarily favored sampling directions and show that model quality can be improved by energy-based optimization along these directions.
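The sampling idea described above can be illustrated with a minimal sketch. It assumes the homologous backbone structures are already superimposed and flattened into coordinate vectors of equal length. The function names (`principal_directions`, `refine_along_directions`, `toy_demo`), the greedy line search, and the quadratic toy energy are all illustrative assumptions, not the force field or optimizer used in the paper.

```python
import numpy as np

def principal_directions(ensemble, k):
    """Return the mean structure and the top-k principal components.

    ensemble: array of shape (n_structures, 3 * n_atoms), assumed to hold
    pre-superimposed backbone coordinates of a homologous family.
    """
    mean = ensemble.mean(axis=0)
    # Rows of vt are orthonormal directions ordered by explained variance.
    _, _, vt = np.linalg.svd(ensemble - mean, full_matrices=False)
    return mean, vt[:k]

def refine_along_directions(start, directions, energy, step=0.5, n_rounds=50):
    """Greedy search that only moves along the supplied directions.

    This is a stand-in optimizer (an assumption); any minimizer restricted
    to the same low-dimensional subspace would illustrate the same idea.
    """
    coords = start.copy()
    best = energy(coords)
    for _ in range(n_rounds):
        improved = False
        for d in directions:
            for sign in (1.0, -1.0):
                trial = coords + sign * step * d
                e = energy(trial)
                if e < best:
                    coords, best = trial, e
                    improved = True
        if not improved:
            step *= 0.5  # shrink the step once no direction helps
    return coords, best

def toy_demo(seed=0):
    """Synthetic family varying along one hidden direction (hypothetical data)."""
    rng = np.random.default_rng(seed)
    dim = 30  # 10 backbone atoms x 3 coordinates
    hidden = rng.normal(size=dim)
    hidden /= np.linalg.norm(hidden)
    base = rng.normal(size=dim)
    amplitudes = 2.0 * rng.normal(size=20)
    noise = 0.01 * rng.normal(size=(20, dim))
    ensemble = base + amplitudes[:, None] * hidden + noise

    # The "native" structure lies along the family's main mode of variation.
    native = base + 3.0 * hidden

    def energy(x):
        # Toy energy: squared distance to native, NOT a physical force field.
        return float(np.sum((x - native) ** 2))

    mean, directions = principal_directions(ensemble, k=2)
    e_start = energy(mean)
    _, e_final = refine_along_directions(mean, directions, energy)
    return e_start, e_final, directions

if __name__ == "__main__":
    e_start, e_final, _ = toy_demo()
    print(f"energy before: {e_start:.4f}, after: {e_final:.4f}")
```

Restricting the search to a few principal components keeps the optimizer from wandering into the many off-family conformations where an imperfect energy function may have false minima; in this toy setting the starting model is simply the family mean.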
