Incorporation of Solvent Effect into Multi-Objective Evolutionary Algorithm for Improved Protein Structure Prediction

Predicting the three-dimensional (3-D) structure of a protein from its one-dimensional sequence has been called the “holy grail of molecular biology”, and it has become an important part of structural genomics projects. Despite rapid developments in computer technology and computational intelligence, the problem remains challenging. In this paper, we propose a multi-objective evolutionary algorithm to address it. We decompose the energy function of the Chemistry at HARvard Macromolecular Mechanics (CHARMM) force field into bond and non-bond energies, which serve as the first and second objectives. To account for the effect of solvent, we adopt the solvent-accessible surface area as the third objective. We verify the proposed method on 66 benchmark proteins and obtain results that are better than or competitive with those of existing methods. The results suggest that incorporating the effect of solvent into a multi-objective evolutionary algorithm is necessary to improve the accuracy and efficiency of protein structure prediction.
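As a minimal sketch of how three such objectives can be compared, the Python snippet below applies standard Pareto dominance to candidate conformations scored by (bond energy, non-bond energy, solvent-accessible surface area). It assumes all three objectives are minimized. The function names and the numeric scores are hypothetical and chosen only for illustration; they are not the implementation or data used in this work.

```python
from typing import List, Sequence, Tuple

# Hypothetical objective vector: (bond energy, non-bond energy, SASA).
# Assumption: all three objectives are to be minimized.
Objectives = Tuple[float, float, float]


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if a is no worse than b everywhere and strictly better somewhere."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def nondominated(points: List[Objectives]) -> List[Objectives]:
    """Keep the points that no other point dominates (the Pareto front)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Illustrative scores for four candidate conformations (made-up values).
candidates: List[Objectives] = [
    (120.0, -350.0, 5200.0),
    (110.0, -340.0, 5100.0),
    (130.0, -330.0, 5300.0),
    (115.0, -360.0, 5150.0),
]

front = nondominated(candidates)
```

Here the second and fourth candidates form the front: neither is better than the other in all three objectives, so a multi-objective search would retain both as trade-off solutions rather than collapsing them into a single weighted score.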
