Maximum Likelihood Haplotyping for General Pedigrees

Haplotype data is valuable for mapping disease-susceptibility genes in studies of Mendelian and complex diseases. We present algorithms for inferring a most likely haplotype configuration for general pedigrees, implemented in the newest version of the genetic linkage analysis system SUPERLINK. SUPERLINK represents genetic linkage analysis problems internally as Bayesian networks, which enables efficient maximum likelihood haplotyping for more complex pedigrees than was previously possible. To support efficient haplotyping for larger pedigrees, we have also incorporated a novel algorithm for determining a better elimination order for the variables of the Bayesian network; this optimization also improves likelihood computations. We present experimental results for the new algorithms on a variety of real and semiartificial data sets, and use our software to evaluate MCMC approximations for haplotyping.
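To illustrate why the elimination order matters, the following is a minimal sketch of a standard greedy min-weight heuristic for ordering the variables of a graphical model. It is not the novel algorithm referred to above, whose details are not given here. The function name `greedy_elimination_order`, the toy graph, and the state counts are all hypothetical and chosen only for illustration.

```python
from itertools import combinations


def greedy_elimination_order(adjacency, weights):
    """Greedy min-weight elimination ordering (a textbook heuristic).

    adjacency: dict mapping each variable to the set of its neighbours
               in the undirected interaction graph.
    weights:   dict mapping each variable to its number of states.
    Returns (order, max_cost), where max_cost is the size of the largest
    intermediate table created while eliminating in that order.
    """
    # Work on a copy so the caller's graph is not modified.
    graph = {v: set(nbrs) for v, nbrs in adjacency.items()}
    order = []
    max_cost = 0

    def cost(v):
        # Size of the table over v and its current neighbours.
        size = weights[v]
        for n in graph[v]:
            size *= weights[n]
        return size

    while graph:
        # Pick the cheapest variable; sorting makes tie-breaking deterministic.
        v = min(sorted(graph), key=cost)
        max_cost = max(max_cost, cost(v))
        nbrs = graph.pop(v)
        # Eliminating v connects all of its remaining neighbours (fill-in edges).
        for a, b in combinations(nbrs, 2):
            graph[a].add(b)
            graph[b].add(a)
        for n in nbrs:
            graph[n].discard(v)
        order.append(v)
    return order, max_cost


# Hypothetical toy graph: one 4-state variable "h" linked to three
# 2-state variables "a", "b", "c".
adjacency = {"h": {"a", "b", "c"}, "a": {"h"}, "b": {"h"}, "c": {"h"}}
weights = {"h": 4, "a": 2, "b": 2, "c": 2}
order, max_cost = greedy_elimination_order(adjacency, weights)
print(order, max_cost)
```

In this toy graph, eliminating the leaves first keeps every intermediate table at 8 entries, whereas eliminating "h" first would create a table of 4 x 2 x 2 x 2 = 32 entries. The same effect, at much larger scale, is why a better order can make haplotyping feasible for larger pedigrees.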
