Genetic algorithm-based improved sampling for protein structure prediction

The quest for efficient sampling algorithms remains a demanding research topic because of their widespread applications. Here, we present an extension of the genetic algorithm (GA) with improved sampling capacity. We develop a fast-navigating genetic algorithm (FNGA) that uses an associated-memory (AM)-based crossover operation, which gives more trials to the subparts of the best chromosomes and helps the search navigate faster. To mitigate the resulting increase in similarity within the population, the twin removal genetic algorithm (TRGA) is applied. The optimally diverse chromosomes generated by TRGA can introduce potentially useful subparts that further enhance the performance of FNGA. We therefore combine FNGA and TRGA and name the combination the kite genetic algorithm (KGA). The proposed FNGA and KGA are tested empirically on benchmark functions, and the results are promising. We further employ KGA in the conformational search for fragment-free protein tertiary structure prediction. The results of ab initio protein structure modelling show that the sampling performance of KGA is competitive.
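The twin-removal idea mentioned above can be illustrated with a minimal sketch. It is not the TRGA implementation from this work: the binary encoding, the position-wise similarity measure, the 0.8 threshold, and the names `similarity` and `remove_twins` are all assumptions made for illustration.

```python
import random

def similarity(a, b):
    """Fraction of positions at which two equal-length chromosomes agree
    (assumed similarity measure; the paper's own measure may differ)."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

def remove_twins(population, threshold=0.8, rng=random):
    """Replace every chromosome that is at least `threshold` similar to an
    already-kept one with a fresh random bit string (hypothetical policy).
    The random replacement is not re-checked, so it may itself be a twin."""
    kept = []
    for chrom in population:
        if any(similarity(chrom, k) >= threshold for k in kept):
            chrom = [rng.randint(0, 1) for _ in chrom]
        kept.append(chrom)
    return kept
```

A lower threshold removes more near-duplicates and so injects more random material into the population; how that trades off against convergence speed depends on the problem.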
