Recent improvements in prediction of protein structure by global optimization of a potential energy function

Recent improvements to a hierarchical ab initio or de novo approach for predicting both α and β structures of proteins are described. The united-residue energy function used in this procedure includes multibody interactions from a cumulant expansion of the free energy of polypeptide chains, with their relative weights determined by Z-score optimization. The critical initial stage of the hierarchical procedure involves a search of conformational space by the conformational space annealing (CSA) method, followed by optimization of an all-atom model. The procedure was assessed in a recent blind test of protein structure prediction (CASP4). The resulting lowest-energy structures of the target proteins (ranging in size from 70 to 244 residues) agreed with the experimental structures in many respects. The entire experimental structure of a cyclic α-helical protein of 70 residues was predicted to within 4.3 Å α-carbon (Cα) rms deviation (rmsd), whereas for other α-helical proteins, fragments of roughly 60 residues were predicted to within 6.0 Å Cα rmsd. Although β structures can now be predicted with the new procedure, the success rate for α/β- and β-proteins is currently lower than that for α-proteins. For the β portions of α/β structures, the Cα rmsd values are less than 6.0 Å for contiguous fragments of 30–40 residues; for one target, three fragments (of 10, 23, and 28 residues, respectively) formed a compact part of the tertiary structure with a Cα rmsd of less than 6.0 Å. Overall, these results constitute an important step toward the ab initio prediction of protein structure solely from the amino acid sequence.
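The abstract does not spell out how the Z-score optimization of the energy-term weights is done. As a rough illustration only, the sketch below uses the common definition of an energy Z-score: the gap between the native energy and the mean energy of a set of non-native (decoy) structures, in units of the decoy standard deviation. The function names, the two-term toy energies, and the weights are all hypothetical and are not taken from the paper.

```python
from statistics import mean, pstdev


def weighted_energy(terms, weights):
    """Total energy as a weighted sum of energy terms (hypothetical form)."""
    return sum(w * t for w, t in zip(weights, terms))


def energy_z_score(native_energy, decoy_energies):
    """Z-score of the native energy relative to a set of decoy energies.

    A more negative value means the native structure is better separated
    from the decoys. This is the generic definition, assumed here; it is
    not necessarily the exact objective used by the authors.
    """
    if len(decoy_energies) < 2:
        raise ValueError("need at least two decoy energies")
    spread = pstdev(decoy_energies)  # population standard deviation
    if spread == 0.0:
        raise ValueError("decoy energies have zero spread")
    return (native_energy - mean(decoy_energies)) / spread


# Toy data (invented): two energy terms per structure.
native_terms = (-10.0, -4.0)
decoy_terms = [(-6.0, -1.0), (-8.0, 0.0), (-4.0, -2.0), (-6.0, -1.0)]
weights = (1.0, 0.5)  # hypothetical relative weights of the two terms

native_e = weighted_energy(native_terms, weights)
decoy_e = [weighted_energy(t, weights) for t in decoy_terms]
z = energy_z_score(native_e, decoy_e)
print(f"Z-score for weights {weights}: {z:.3f}")
```

Under this definition, a weight optimization would search for the weights that make the Z-score as negative as possible over a training set of proteins; the search itself is omitted here.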
