Enhanced Evolutionary and Heuristic Algorithms for Haplotype Reconstruction Problem Using Minimum Error Correction Model

The construction of two haplotypes from a set of Single Nucleotide Polymorphism (SNP) fragments is referred to as the haplotype reconstruction problem. One of the most important computational models for this problem is Minimum Error Correction (MEC). Since MEC is NP-hard, we propose a heuristic algorithm for the haplotype reconstruction problem. The algorithm is based on Particle Swarm Optimization (PSO), which is an evolutionary algorithm (EA). Evolutionary algorithms are stochastic search algorithms that imitate natural biological evolution or the social behavior of species. Unlike exact solution of the MEC model, our algorithm produces results in feasible time and can be applied to large datasets. Our results suggest that the algorithm has a lower reconstruction error rate than the other algorithms, and that this error is very close to zero when the algorithm is applied to actual biological data. We present a comprehensive comparison between PSO and four well-known algorithms from the literature, together with a discussion of the input parameters that influence the reconstruction error rate.
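To make the MEC objective concrete, the following minimal Python sketch scores a candidate haplotype pair against a set of fragments. It is an illustration only and not the implementation evaluated in this work: the encoding (strings over `0`/`1` with `-` for sites a fragment does not cover) and the names `mismatches` and `mec_score` are assumptions introduced here. Each fragment is assigned to whichever haplotype it disagrees with least, and the MEC score is the total number of entries that would have to be corrected.

```python
from typing import Sequence

GAP = "-"  # assumed symbol for a SNP site that a fragment does not cover


def mismatches(fragment: str, haplotype: str) -> int:
    """Count covered sites where the fragment disagrees with the haplotype."""
    return sum(1 for f, h in zip(fragment, haplotype) if f != GAP and f != h)


def mec_score(fragments: Sequence[str], h1: str, h2: str) -> int:
    """Total corrections needed when each fragment joins its closer haplotype."""
    return sum(min(mismatches(f, h1), mismatches(f, h2)) for f in fragments)


# Hypothetical toy input: five fragments over four SNP sites.
fragments = ["01-0", "0110", "10-1", "1001", "-111"]
print(mec_score(fragments, "0110", "1001"))  # prints 1
```

A search heuristic such as PSO would use a function like this as the fitness to minimize over candidate haplotype pairs; how candidates are encoded as particles is not specified in the abstract and is left open here.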
