SADE-SPL: A Self-Adapting Differential Evolution algorithm with a loop Structure Pattern Library for the PSP problem

Knowledge of a protein's conformation allows its biological function to be inferred and studied. Because protein function is determined by the molecule's shape and by the physicochemical properties of its exposed surface, predicting accurate protein models is extremely important. One of the hardest problems in Structural Bioinformatics is predicting the three-dimensional structure of a protein from its amino acid sequence (primary structure) alone. Coils and turns are secondary structure elements in which the polypeptide chain reverses its overall direction; they are considered the most difficult secondary structures to predict. In this paper, we propose a loop Structure Pattern Library (SPL), built from experimental information extracted from the Protein Data Bank, that aims to constrain the conformational search space of proteins. The Self-Adapting Differential Evolution (SADE) meta-heuristic was implemented for the tertiary protein structure prediction problem using the Structure Pattern Library as a source of knowledge. The SADE algorithm was tested with (SADE-SPL) and without the Structure Pattern Library. The results show that the lowest Root Mean Square Deviation (RMSD) values were obtained when the Structure Pattern Library was employed, and average GDT_TS values were higher in all SADE-SPL cases. These results indicate that SADE guided by SPL knowledge predicts three-dimensional protein structures closer to the experimental structures than SADE without the SPL.
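As an illustration of the two quality measures named above, the following minimal Python sketch computes RMSD and a simplified GDT_TS. It is not the evaluation code used in the paper. It assumes the predicted and experimental structures are already optimally superposed and are each given as one (x, y, z) coordinate per residue (for example the C-alpha atoms). The function names `rmsd` and `gdt_ts` are hypothetical, and the cutoffs of 1, 2, 4 and 8 Å are the conventional GDT_TS thresholds. A full GDT_TS implementation also searches over many superpositions, which this sketch omits.

```python
import math


def rmsd(pred, ref):
    """Root mean square deviation between two equal-length coordinate lists.

    Assumes both structures are already superposed (no alignment is done here).
    """
    if len(pred) != len(ref) or not pred:
        raise ValueError("structures must be non-empty and of equal length")
    squared = sum(math.dist(p, r) ** 2 for p, r in zip(pred, ref))
    return math.sqrt(squared / len(pred))


def gdt_ts(pred, ref, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """Simplified GDT_TS on a 0-100 scale.

    Averages, over the distance cutoffs (in angstroms), the fraction of
    residues whose predicted position lies within that cutoff of the
    reference position. Uses a single fixed superposition.
    """
    if len(pred) != len(ref) or not pred:
        raise ValueError("structures must be non-empty and of equal length")
    dists = [math.dist(p, r) for p, r in zip(pred, ref)]
    fractions = [sum(d <= c for d in dists) / len(dists) for c in cutoffs]
    return 100.0 * sum(fractions) / len(cutoffs)


if __name__ == "__main__":
    # Toy four-residue example with per-residue errors of 0, 1.5, 3 and 9 angstroms.
    reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
    predicted = [(0.0, 0.0, 0.0), (1.0, 1.5, 0.0), (2.0, 3.0, 0.0), (3.0, 9.0, 0.0)]
    print(f"RMSD   = {rmsd(predicted, reference):.3f}")
    print(f"GDT_TS = {gdt_ts(predicted, reference):.2f}")
```

A lower RMSD and a higher GDT_TS both indicate a prediction closer to the experimental structure, which is the sense in which the SADE-SPL results are reported as better.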
