Refined Genetic Algorithm Simulations to Model Proteins

Abstract

The advent of completely sequenced genomes is producing unprecedented growth in sequence information, while adequate structure information is often lacking. We have refined genetic algorithm simulations and applied them as a tool for addressing this gap. Modified strategies are first tested on simple lattice protein models. These include consideration of entropy (the water shell adjacent to the protein) and improved search strategies (pioneer search, +14% search efficiency; systematic recombination, +50% search efficiency). Next, we examine the extension to grid-free simulations of proteins in full main chain representation. Our protein main chain simulations are further refined by independent criteria, such as fitness per residue, for judging the predicted structures obtained at the end of a simulation. Protein families and protein interactions predicted from the complete H. pylori genomic sequence demonstrate how the full main chain simulations are then applied to model new protein sequences and protein families apparent from genome analysis.
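To make the lattice stage concrete, the following is a minimal sketch of a genetic algorithm on a simple lattice protein model. It is not the refined method summarized above: it assumes a 2D square lattice with a two-letter hydrophobic/polar (HP) sequence, an absolute move encoding (U, D, L, R), a fitness equal to the number of non-bonded H-H contacts, tournament selection, one-point crossover and elitism. All of these choices, and the names `fold`, `fitness` and `evolve`, are illustrative assumptions. The entropy term, pioneer search and systematic recombination are not implemented here.

```python
import random

# Absolute moves on a 2D square lattice (assumed encoding, not taken from the source).
MOVES = {"U": (0, 1), "D": (0, -1), "L": (-1, 0), "R": (1, 0)}


def fold(moves):
    """Return lattice coordinates for a move string, or None if the chain overlaps itself."""
    pos = (0, 0)
    coords = [pos]
    seen = {pos}
    for m in moves:
        dx, dy = MOVES[m]
        pos = (pos[0] + dx, pos[1] + dy)
        if pos in seen:
            return None
        seen.add(pos)
        coords.append(pos)
    return coords


def fitness(sequence, moves):
    """Count non-bonded H-H lattice contacts; self-overlapping chains score -1."""
    coords = fold(moves)
    if coords is None:
        return -1
    index = {c: i for i, c in enumerate(coords)}
    contacts = 0
    for i, (x, y) in enumerate(coords):
        if sequence[i] != "H":
            continue
        # Look only right and up so that each neighbouring pair is counted once.
        for dx, dy in ((1, 0), (0, 1)):
            j = index.get((x + dx, y + dy))
            if j is not None and sequence[j] == "H" and abs(i - j) > 1:
                contacts += 1
    return contacts


def evolve(sequence, pop_size=100, generations=200, mutation_rate=0.1, seed=0):
    """Plain genetic algorithm; expects a sequence of at least three residues."""
    rng = random.Random(seed)
    n = len(sequence) - 1  # number of moves needed to place all residues

    def score(m):
        return fitness(sequence, m)

    pop = ["".join(rng.choice("UDLR") for _ in range(n)) for _ in range(pop_size)]
    best = max(pop, key=score)
    for _ in range(generations):
        new_pop = [best]  # elitism: carry the best conformation over unchanged
        while len(new_pop) < pop_size:
            # Tournament selection of two parents (tournament size 3).
            a = max(rng.sample(pop, 3), key=score)
            b = max(rng.sample(pop, 3), key=score)
            cut = rng.randrange(1, n)  # one-point crossover
            child = list(a[:cut] + b[cut:])
            for k in range(n):
                if rng.random() < mutation_rate:
                    child[k] = rng.choice("UDLR")
            new_pop.append("".join(child))
        pop = new_pop
        best = max(pop, key=score)
    return best, score(best)
```

For example, `evolve("HPHPPHHPHPPHPHHPPHPH")` returns the best move string found together with its contact count. Because of elitism, the best score never decreases from one generation to the next, although a short run may still end on a self-overlapping chain.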