Haplotype Reconstruction in Large Pedigrees with Untyped Individuals through IBD Inference

Haplotypes specify the linkage patterns between dispersed genetic variations and therefore provide important information for understanding the genetics of human traits. However, haplotypes cannot be obtained directly from current genotyping platforms, which has motivated extensive investigation of computational methods to recover them. Two major computational challenges in current family-based disease studies are large family sizes and many ungenotyped family members. Traditional haplotyping methods can handle neither large families nor families with missing members. In this article, we propose a method that addresses these issues by integrating multiple novel techniques. The method consists of three major components: pairwise identical-by-descent (IBD) inference, global IBD reconstruction, and haplotype restoration. By reconstructing the global IBD of a family from pairwise IBD and then restoring the haplotypes based on the inferred IBD, the method scales to large pedigrees and, more importantly, handles families with missing members. Compared with existing approaches, it demonstrates much higher power to recover haplotype information, especially in families with many untyped individuals.

Availability: http://sites.google.com/site/xinlishomepage/pedibd
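To make the idea of going from pairwise IBD to global IBD and then to restored haplotypes concrete, here is a toy single-locus sketch. It is not the algorithm of the article: it assumes the pairwise IBD links are already known without error, that alleles of typed individuals are already phased, and that there is no recombination to model. The names `global_ibd_classes` and `restore_alleles`, and the `(individual, "p"/"m")` labelling of chromosome copies, are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def global_ibd_classes(copies, pairwise_ibd):
    """Merge pairwise IBD links between chromosome copies into global
    IBD classes using a simple union-find structure."""
    parent = {c: c for c in copies}

    def find(c):
        # Follow parent pointers to the class representative (path halving).
        while parent[c] != c:
            parent[c] = parent[parent[c]]
            c = parent[c]
        return c

    for a, b in pairwise_ibd:
        parent[find(a)] = find(b)

    classes = defaultdict(set)
    for c in copies:
        classes[find(c)].add(c)
    return list(classes.values())


def restore_alleles(classes, observed):
    """Copy an observed allele to every chromosome copy in the same IBD
    class. Classes with no observed allele are left unresolved."""
    restored = dict(observed)
    for cls in classes:
        alleles = {observed[c] for c in cls if c in observed}
        if len(alleles) > 1:
            raise ValueError(f"conflicting alleles in IBD class {sorted(cls)}")
        if alleles:
            allele = alleles.pop()
            for c in cls:
                restored[c] = allele
    return restored


# Toy pedigree (assumed): an untyped father and two typed children.
# "p" / "m" mark the paternally / maternally inherited copy.
copies = [("father", "p"), ("father", "m"), ("child1", "p"), ("child2", "p")]
pairwise_ibd = [
    (("child1", "p"), ("father", "p")),  # child1 inherited father's paternal copy
    (("child2", "p"), ("father", "m")),  # child2 inherited father's maternal copy
]
observed = {("child1", "p"): "A", ("child2", "p"): "G"}

classes = global_ibd_classes(copies, pairwise_ibd)
restored = restore_alleles(classes, observed)
print(restored[("father", "p")], restored[("father", "m")])
```

In this toy case both alleles of the untyped father are recovered because each of his chromosome copies shares an IBD class with a typed child. A real pedigree would additionally require handling unphased genotypes, recombination between loci, and uncertainty in the inferred IBD.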
