An Overview of Combinatorial Methods for Haplotype Inference

A current high-priority phase of human genomics involves the development of a full Haplotype Map of the human genome [23]. The map will be used in large-scale population screens to associate specific haplotypes with specific complex, genetically influenced diseases. A key problem, and perhaps the bottleneck, is to infer haplotype pairs from genotype data computationally. This paper follows the talk given at the DIMACS Conference on SNPs and Haplotypes held in November 2002. It reviews several combinatorial approaches to the haplotype inference problem that we have investigated over the last several years. It also updates some of the work presented earlier and discusses the current state of our work.

[1] L. Helmuth. Genome research: map of the human genome 3.0, 2001, Science.

[2] Dan Gusfield et al. Inference of Haplotypes from Samples of Diploid Populations: Complexity and Algorithms, 2001, J. Comput. Biol.

[3] Dan Gusfield et al. Haplotype Inference by Pure Parsimony, 2003, CPM.

[4] L. Helmuth. Map of the Human Genome 3.0, 2001, Science.

[5] Andrew G. Clark et al. Computational Methods for SNPs and Haplotype Inference, 2002, Lecture Notes in Computer Science.

[6] R. Karp et al. Efficient reconstruction of haplotype structure via perfect phylogeny, 2002, Journal of Bioinformatics and Computational Biology.

[7] M. Daly et al. High-resolution haplotype structure in the human genome, 2001, Nature Genetics.

[8] E. Boerwinkle et al. Apolipoprotein E variation at the sequence haplotype level: implications for the origin and maintenance of a major human polymorphism, 2000, American Journal of Human Genetics.

[9] Fanica Gavril et al. An algorithm for constructing edge-trees from hypergraphs, 1983, Networks.

[10] Dan Gusfield et al. Efficient algorithms for inferring evolutionary trees, 1991, Networks.

[11] Giuseppe Lancia et al. Haplotyping Populations: Complexity and Approximations, 2002.

[12] Dan Gusfield et al. Algorithms on Strings, Trees, and Sequences - Computer Science and Computational Biology, 1997.

[13] Dan Gusfield. Algorithms on Strings, Trees, and Sequences - Computer Science and Computational Biology, 1997.

[14] Robert E. Bixby et al. An Almost Linear-Time Algorithm for Graph Realization, 1988, Math. Oper. Res.

[15] Richard M. Karp et al. Large scale reconstruction of haplotypes from genotype data, 2003, RECOMB '03.

[16] Dan Gusfield et al. Haplotyping as perfect phylogeny: conceptual framework and efficient solutions, 2002, RECOMB '02.

[17] R. Hudson. Gene genealogies and the coalescent process, 1990.

[18] A. Chakravarti et al. Haplotype inference in random population samples, 2002, American Journal of Human Genetics.

[19] Dan Gusfield et al. Empirical Exploration of Perfect Phylogeny Haplotyping and Haplotypers, 2003, COCOON.

[20] Shibu Yooseph et al. Haplotyping as Perfect Phylogeny: A Direct Approach, 2003, J. Comput. Biol.

[21] Dan Gusfield et al. The Fine Structure of Galls in Phylogenetic Networks, 2004, INFORMS J. Comput.

[22] Dan Gusfield et al. Optimal, Efficient Reconstruction of Phylogenetic Networks with Constrained Recombination, 2004, J. Bioinform. Comput. Biol.

[23] Dan Gusfield et al. Perfect phylogeny haplotyper: haplotype inferral using a tree model, 2003, Bioinform.

[24] E. Boerwinkle et al. Haplotype structure and population genetic inferences from nucleotide-sequence variation in human lipoprotein lipase, 1998, American Journal of Human Genetics.

[25] D. Gusfield et al. Analysis and exploration of the use of rule-based algorithms and consensus methods for the inferral of haplotypes, 2003, Genetics.

[26] Kenneth Steiglitz et al. Combinatorial Optimization: Algorithms and Complexity, 1981.

[27] V. Rich. Personal communication, 1989, Nature.

[28] Zhaohui S. Qin et al. Bayesian haplotype inference for multiple linked single-nucleotide polymorphisms, 2002, American Journal of Human Genetics.

[29] Lusheng Wang et al. Haplotype inference by maximum parsimony, 2003, Bioinform.

[30] Dan Gusfield et al. Efficient reconstruction of phylogenetic networks with constrained recombination, 2003, Proceedings of the 2003 IEEE Bioinformatics Conference (CSB2003).

[31] A. Clark et al. Inference of haplotypes from PCR-amplified samples of diploid populations, 1990, Molecular Biology and Evolution.

[32] P. Donnelly et al. A new statistical method for haplotype reconstruction from population data, 2001, American Journal of Human Genetics.