Determination of protein structure and dynamics combining immune algorithms and pattern search methods

Natural proteins fold quickly into a complex three-dimensional structure. Evolutionary algorithms have been used to predict the native structure, taken to be the lowest-energy conformation of a protein's primary sequence. Successful structure prediction requires a free energy function sufficiently close to the true potential for the native state, as well as a method for exploring the conformational space. Protein structure prediction is challenging because current potential functions have limited accuracy and the conformational space is vast. In this work, we present a new approach to the protein folding (PF) problem based on a hybrid of an Immune Algorithm (IMMALG) and a quasi-Newton method, starting from a population of promising protein conformations created by the global optimizer DIRECT. DIRECT produces this initial population of candidate solutions within a potentially optimal rectangle for the funnel landscape of the PF problem. The method has been tested on the Met-enkephalin peptide, a paradigmatic example of the multiple-minima problem, and on 1POLY, 1ROP and the three-helix protein 1BDC. The experimental results show that this multistage approach is a competitive and effective search method in the conformational space of real proteins, in terms of both solution quality and computational cost, when compared with current state-of-the-art algorithms.
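The multistage idea described above (a seeded population, an immune-inspired search, then local refinement of the best candidate) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not the authors' implementation: `toy_energy` is a hypothetical multiple-minima function over torsion angles rather than a force field, uniform random sampling stands in for the DIRECT stage, the hypermutation is a plain Gaussian perturbation, and finite-difference gradient descent with backtracking stands in for the quasi-Newton step.

```python
import math
import random


def toy_energy(angles):
    """Hypothetical multiple-minima surrogate over torsion angles (radians).
    Not a real potential; its global minimum is 0 when every angle is 0."""
    return sum(1.0 - math.cos(a) + 0.5 * (1.0 - math.cos(3.0 * a)) for a in angles)


def wrap(angle):
    """Map an angle back into [-pi, pi)."""
    return (angle + math.pi) % (2.0 * math.pi) - math.pi


def seed_population(n_angles, size, rng):
    """Stand-in for the DIRECT stage: uniform random conformations."""
    return [[rng.uniform(-math.pi, math.pi) for _ in range(n_angles)]
            for _ in range(size)]


def clonal_selection(population, energy, generations, clones, sigma, rng):
    """Elitist clonal selection: clone each candidate, perturb the clones
    (Gaussian hypermutation, an assumed operator), keep the best individuals."""
    pop = sorted(population, key=energy)
    for _ in range(generations):
        offspring = [[wrap(a + rng.gauss(0.0, sigma)) for a in parent]
                     for parent in pop for _ in range(clones)]
        pop = sorted(pop + offspring, key=energy)[:len(population)]
    return pop


def local_refine(angles, energy, step=0.05, iters=200, h=1e-5):
    """Stand-in for the quasi-Newton stage: finite-difference gradient
    descent with backtracking; only energy-decreasing steps are accepted."""
    x = list(angles)
    fx = energy(x)
    for _ in range(iters):
        grad = []
        for i in range(len(x)):
            plus, minus = x[:], x[:]
            plus[i] += h
            minus[i] -= h
            grad.append((energy(plus) - energy(minus)) / (2.0 * h))
        t = step
        while t > 1e-8:
            candidate = [xi - t * g for xi, g in zip(x, grad)]
            fc = energy(candidate)
            if fc < fx:
                x, fx = candidate, fc
                break
            t *= 0.5
        else:
            break  # no improving step found: treat as converged
    return x, fx


def run_pipeline(n_angles=4, pop_size=10, seed=0):
    """Return the best energy after each of the three stages."""
    rng = random.Random(seed)
    initial = seed_population(n_angles, pop_size, rng)
    evolved = clonal_selection(initial, toy_energy, generations=30,
                               clones=3, sigma=0.3, rng=rng)
    _, refined_energy = local_refine(evolved[0], toy_energy)
    return min(map(toy_energy, initial)), toy_energy(evolved[0]), refined_energy


if __name__ == "__main__":
    print(run_pipeline())
```

Because the selection step is elitist and the refinement accepts only improving steps, the best energy cannot increase from one stage to the next. In a more faithful experiment, a DIRECT implementation, a real energy function such as CHARMM, and a limited-memory quasi-Newton routine would replace the stand-ins.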
