With the launch of the international HapMap project, the haplotype inference problem has recently attracted a great deal of attention in the computational biology community. In this paper, we study how to efficiently infer haplotypes from the genotypes of individuals related by a pedigree without mating loops, assuming that the hereditary process was free of mutations (i.e., it obeys the Mendelian law of inheritance) and of recombinants. We model the haplotype inference problem as a system of linear equations as in (Xiao et al. in Proceedings of the 18th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA’07), pp. 655–664, 2007) and present an optimal linear-time (i.e., $O(mn)$-time) algorithm that generates a particular solution to the haplotype inference problem, where $m$ is the number of loci (or markers) in a genotype and $n$ is the number of individuals in the pedigree. The algorithm also provides a general solution in $O(mn^2)$ time, which is optimal because the descriptive size of a general solution can be as large as $\Theta(mn^2)$. The key ingredients of our construction are (i) a fast consistency-checking procedure for the system of linear equations introduced in (Xiao et al. 2007), based on a careful investigation of the relationships between the equations, and (ii) a novel linear-time method for solving linear equations without invoking Gaussian elimination. Although no such fast method is known for general systems of linear equations, we take advantage of the underlying loop-free pedigree graph and some special properties of the linear equations.
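To make the zero-recombinant, mutation-free assumption concrete, the following is a minimal sketch for a single father-mother-child trio. It is a brute-force check, not the linear-time algorithm described above, and every name in it (`genotype_of`, `consistent_without_recombination`, the 0/1 allele encoding) is a hypothetical choice made for illustration. Under the assumption, each parent passes one of its two haplotypes to the child unchanged, so a child genotype is consistent only if some such pair reproduces it at every locus.

```python
from itertools import product

def genotype_of(hap_a, hap_b):
    """Return the unordered genotype at each locus for a pair of haplotypes."""
    return [tuple(sorted(pair)) for pair in zip(hap_a, hap_b)]

def consistent_without_recombination(father, mother, child_genotype):
    """Check a trio under the no-mutation, no-recombination assumption.

    father, mother: pairs of haplotypes, each a tuple of 0/1 alleles
                    (hypothetical encoding chosen for this sketch).
    child_genotype: list of unordered allele pairs, one per locus.
    The child must receive one whole haplotype from each parent, so
    only 2 x 2 combinations need to be tried.
    """
    target = [tuple(sorted(g)) for g in child_genotype]
    return any(genotype_of(f, m) == target for f, m in product(father, mother))

father = ((0, 1, 0), (1, 0, 1))
mother = ((0, 0, 0), (1, 1, 1))

# Explained by father's first haplotype and mother's second haplotype.
ok = consistent_without_recombination(father, mother, [(0, 1), (1, 1), (0, 1)])

# Would require the father to pass (0, 0, 1), a recombinant of his two haplotypes.
bad = consistent_without_recombination(father, mother, [(0, 0), (0, 0), (0, 1)])
```

This enumeration only checks known parental haplotypes against one child. The problem studied here is harder: the haplotypes themselves are unknown and must be inferred jointly for all $n$ individuals, which is why it is modelled as a system of linear equations.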
[1] D. Qian, et al. Minimum-recombinant haplotyping in pedigrees. American journal of human genetics, 2002.
[2] Jing Xiao, et al. Fast elimination of redundant linear equations and reconstruction of recombination-free mendelian inheritance on a pedigree. SODA '07, 2007.
[3] Ming-Yang Kao, et al. Linear-Time Haplotype Inference on Pedigrees Without Recombinations. WABI, 2006.
[4] G. Abecasis, et al. Merlin—rapid analysis of dense genetic maps using sparse gene flow trees. Nature Genetics, 2002.
[5] Tao Jiang, et al. Computing the Minimum Recombinant Haplotype Configuration from Incomplete Genotype Data on a Pedigree by Integer Linear Programming. J. Comput. Biol., 2005.
[6] Toshihiro Tanaka. The International HapMap Project. Nature, 2003.
[7] Tao Jiang, et al. An exact solution for finding minimum recombinant haplotype configurations on pedigrees with missing data by integer linear programming. RECOMB, 2004.
[8] M. Daly, et al. High-resolution haplotype structure in the human genome. Nature Genetics, 2001.
[9] J. O’Connell. Zero‐recombinant haplotyping: Applications to fine mapping using SNPs. Genetic epidemiology, 2000.
[10] Tao Jiang, et al. Efficient rule-based haplotyping algorithms for pedigree data. RECOMB '03, 2003.