Heuristically Tuned GA to Solve Genome Fragment Assembly Problem

We propose a genetic algorithm (GA) approach to the genome fragment assembly problem. The main contribution of this work is two additions that improve the efficiency of the algorithm: (1) a chromosome reduction step (CRed), which shortens the chromosome and thereby shrinks the search space, and (2) a chromosome refinement step (CRef), a greedy heuristic that locally improves the fitness of chromosomes. As the algorithm continues to run, it produces progressively longer contigs separated by progressively shorter gaps. At any stage the user can view the current result and either stop the run once the output serves their purpose or let it continue to obtain longer contigs. We ran the proposed algorithm on part of the Wolbachia project data and compared the results.
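To make the two steps concrete, the following is a minimal sketch and not the authors' implementation. It assumes a chromosome is a permutation of fragment indices, that fitness is the sum of exact suffix-prefix overlaps between neighbouring fragments, that refinement is a greedy pairwise-swap search, and that reduction merges neighbouring fragments whose overlap reaches a threshold. The function names (`overlap`, `fitness`, `refine`, `reduce_chromosome`) and the parameters `min_len` and `min_overlap` are hypothetical and chosen only for illustration.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that exactly matches a prefix of b.

    Assumption: exact matching only; a real assembler would tolerate errors.
    """
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def fitness(order, frags):
    """Sum of overlaps between neighbouring fragments in the given order."""
    return sum(overlap(frags[i], frags[j]) for i, j in zip(order, order[1:]))


def refine(order, frags):
    """Refinement sketch: greedily keep any pairwise swap that raises fitness."""
    order = list(order)
    best = fitness(order, frags)
    improved = True
    while improved:
        improved = False
        for i in range(len(order)):
            for j in range(i + 1, len(order)):
                order[i], order[j] = order[j], order[i]
                score = fitness(order, frags)
                if score > best:
                    best = score
                    improved = True
                else:
                    # Undo the swap when it does not help.
                    order[i], order[j] = order[j], order[i]
    return order


def reduce_chromosome(order, frags, min_overlap=4):
    """Reduction sketch: merge neighbours overlapping by at least min_overlap.

    Returns a shorter list of contigs, so later generations search a
    smaller permutation space.
    """
    contigs = [frags[order[0]]]
    for idx in order[1:]:
        k = overlap(contigs[-1], frags[idx])
        if k >= min_overlap:
            contigs[-1] += frags[idx][k:]
        else:
            contigs.append(frags[idx])
    return contigs


if __name__ == "__main__":
    frags = ["ACGTAC", "TACGGA", "GGATTC"]
    order = refine([1, 0, 2], frags)
    print(order, fitness(order, frags))
    print(reduce_chromosome(order, frags, min_overlap=3))
```

In this sketch the refinement step never lowers fitness, because a swap is kept only when the score strictly improves. The reduction step returns fewer, longer sequences, which mirrors the idea of shrinking the search space as contigs grow.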
