A molecular dynamics and knowledge-based computational strategy to predict native-like structures of polypeptides

One of the main research problems in structural bioinformatics is the prediction of the three-dimensional (3-D) structures of polypeptides or proteins. Amino acid sequences are being identified much faster than 3-D protein structures can be determined by experimental methods such as X-ray diffraction and NMR, which are both expensive and time consuming. Predicting the correct 3-D structure of a protein molecule is an intricate and arduous task. In computational complexity theory, the protein structure prediction (PSP) problem is NP-complete. To reduce computing time, current efforts have targeted hybridizations of ab initio and knowledge-based methods, aiming at efficient prediction of the correct structure of polypeptides. In this article we present a hybrid method for the 3-D protein structure prediction problem. A knowledge-based method that uses an artificial neural network to predict approximate 3-D protein structures is combined with an ab initio strategy: molecular dynamics (MD) simulation is used to refine the approximate structures. In the refinement step, global interactions between each pair of atoms in the molecule (including non-bonded interactions) are evaluated. The MD protocol we developed corrects the torsion-angle deviations of the predicted polypeptide structures and improves their stereochemical quality. The results show that the time needed to predict native-like 3-D structures is considerably reduced. We tested our computational strategy on four mini-proteins whose sizes vary from 19 to 34 amino acid residues. The structures obtained at the end of 32.0 nanoseconds (ns) of MD simulation were topologically comparable to their corresponding experimental structures.
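The abstract does not state how torsion-angle deviations are measured, so the following is only a minimal sketch of one common way to do it, under stated assumptions: each backbone torsion is computed from four consecutive atom positions, and the deviation between a predicted and a refined angle is wrapped into the interval [-180, 180) degrees. The function names `dihedral` and `angular_deviation` and the coordinates used are hypothetical and are not taken from the article or from any MD package.

```python
import math


def _sub(a, b):
    """Component-wise difference a - b of two 3-D points."""
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    """Dot product of two 3-D vectors."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    """Cross product of two 3-D vectors."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def dihedral(p0, p1, p2, p3):
    """Signed torsion angle (degrees) defined by four atom positions.

    A cis arrangement gives 0 degrees and a trans arrangement gives
    180 degrees in magnitude.
    """
    b0 = _sub(p1, p0)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    n1 = _cross(b0, b1)  # normal of the plane through p0, p1, p2
    n2 = _cross(b1, b2)  # normal of the plane through p1, p2, p3
    x = _dot(n1, n2)
    y = math.sqrt(_dot(b1, b1)) * _dot(b0, n2)
    return math.degrees(math.atan2(y, x))


def angular_deviation(angle_a, angle_b):
    """Difference angle_a - angle_b wrapped into [-180, 180) degrees."""
    return (angle_a - angle_b + 180.0) % 360.0 - 180.0


# Hypothetical coordinates (arbitrary units), for illustration only.
cis = dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (1, 0, 1))
trans = dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (-1, 0, 1))
print(cis, abs(trans), abs(angular_deviation(170.0, -170.0)))
```

Wrapping the difference matters because torsion angles are periodic: angles of 170 and -170 degrees differ by 20 degrees, not 340.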
