Optimized Evolutionary Strategies in Conformational Sampling

Novel genetic algorithm (GA)-based strategies aimed specifically at multimodal optimization problems were developed by hybridizing the GA with alternative optimization heuristics, and were used to search for the largest possible number of minimum-energy conformations (geometries) of complex molecules (conformational sampling). The target function, intramolecular energy, describes a very complex nonlinear response hypersurface in the phase space of the structural degrees of freedom, namely the torsional angles that control the relative rotation of fragments connected by covalent bonds. The energy surface of cyclodextrin, a macrocyclic sugar molecule with N = 65 degrees of freedom, served as the model system for testing and tuning the proposed multimodal optimization strategies. The success of a GA is known to depend on the particular hypotheses used to simulate Darwinian evolution. The conformational sampling GA (CSGA) was therefore designed to allow extensive control over the evolution process through tunable parameters: some are classical GA controls (population size, mutation frequency, etc.), while others control the population diversity management tools designed here or the frequency of calls to the alternative heuristics. Together they form a large set of operational parameters, and a (genetic) meta-optimization procedure was used to search for parameter configurations that maximize the efficiency of the CSGA process. The specific impact of disabling a given hybridizing heuristic was estimated relative to the default sampling behavior (with all implemented heuristics on). Optimal sampling performance was obtained with a GA featuring a built-in tabu search mechanism, a “Lamarckian” (gradient-based) optimization tool and, most notably, a “directed mutations” engine (a torsional angle driving procedure that generates chromosomes which differ radically from their parents but have a good chance of being “fit”, unlike offspring from spontaneous mutations).
“Biasing” heuristics, which replace the default ‘flat’ rule for picking torsional angle values with more elaborate random draw distribution laws, were at best unconvincing and at worst outright harmful. Naive Bayesian analysis was employed to estimate the impact of the operational parameters on CSGA success. The study emphasized the importance of proper tuning of the CSGA. The meta-optimization procedure implicitly manages, in the context of an evolving operational parameterization, the repeated GA runs that are mandatory for reproducible sampling of such vast phase spaces. It should therefore be seen not only as a tuning tool but as the strategy for actual problem solving, essentially advocating a parallel exploration of problem space and parameter space.
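As an illustration only, the interplay of a GA with a tabu list and a “Lamarckian” gradient step can be sketched on a toy problem. Everything in the sketch below is an assumption made for the example and is not taken from the CSGA itself: the energy function (a threefold cosine term per torsion plus a weak neighbor coupling), the six-torsion chromosome, the 30-degree binning used as a tabu key, and all names and parameter values. The “directed mutations” engine and the meta-optimization layer are not covered.

```python
import math
import random

N_TORSIONS = 6            # hypothetical toy size (the paper's model system has N = 65)
BIN = math.pi / 6.0       # 30-degree bins for the tabu key (assumed resolution)


def wrap(a):
    """Map an angle to the interval [-pi, pi)."""
    return (a + math.pi) % (2.0 * math.pi) - math.pi


def energy(angles):
    """Toy multimodal energy: threefold torsional term plus neighbor coupling."""
    e = sum(1.0 + math.cos(3.0 * a) for a in angles)
    e += sum(0.3 * (1.0 - math.cos(a - b)) for a, b in zip(angles, angles[1:]))
    return e


def gradient(angles):
    """Analytic gradient of the toy energy."""
    g = [-3.0 * math.sin(3.0 * a) for a in angles]
    for i in range(len(angles) - 1):
        d = 0.3 * math.sin(angles[i] - angles[i + 1])
        g[i] += d
        g[i + 1] -= d
    return g


def lamarckian_refine(angles, steps=50, rate=0.05):
    """Gradient descent whose result is written back into the chromosome."""
    x = list(angles)
    for _ in range(steps):
        x = [wrap(a - rate * gi) for a, gi in zip(x, gradient(x))]
    return x


def tabu_key(angles):
    """Coarse fingerprint: nearest 30-degree bin of every torsion."""
    return tuple(round(a / BIN) % 12 for a in angles)


def run_csga_sketch(generations=15, pop_size=20, mutation_rate=0.2, seed=0):
    """Return the distinct refined conformers found, sorted by energy."""
    rng = random.Random(seed)
    tabu = set()          # fingerprints of conformers already visited
    found = {}            # fingerprint -> (energy, angles)
    pop = [[rng.uniform(-math.pi, math.pi) for _ in range(N_TORSIONS)]
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=energy)
        parents = pop[:pop_size // 2]
        children = []
        for _ in range(10 * pop_size):            # bounded number of attempts
            if len(children) >= pop_size - len(parents):
                break
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, N_TORSIONS)    # one-point crossover
            child = a[:cut] + b[cut:]
            # spontaneous mutation: flat random redraw of a torsion
            child = [rng.uniform(-math.pi, math.pi)
                     if rng.random() < mutation_rate else t for t in child]
            child = lamarckian_refine(child)
            key = tabu_key(child)
            if key in tabu:                       # already sampled: reject
                continue
            tabu.add(key)
            found[key] = (energy(child), child)
            children.append(child)
        pop = parents + children
    return sorted(found.values())


if __name__ == "__main__":
    conformers = run_csga_sketch()
    print(len(conformers), "distinct conformers; lowest energy:", conformers[0][0])
```

In this sketch the tabu list simply rejects offspring that fall into an already visited bin combination, which pushes the search toward new minima rather than toward repeated rediscovery of the best one. That is one plausible reading of tabu search in a sampling context, not a description of the mechanism actually built into the CSGA.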
