Haplotype Reconstruction in Large Pedigrees with Many Untyped Individuals

Haplotypes specify the linkage patterns between dispersed genetic variations and therefore provide important information for understanding the genetics of human traits. However, haplotypes are not directly available from current genotyping platforms, so computational methods for recovering them have been investigated extensively. Two major computational challenges in current family-based disease studies are large family sizes and many ungenotyped family members. Traditional haplotyping methods handle neither large families nor families with missing members. In this paper, we propose a method that addresses both issues by integrating multiple novel techniques. The method consists of three major components: pairwise identical-by-descent (IBD) inference, global IBD reconstruction, and haplotype restoration. By reconstructing the global IBD of a family from pairwise IBD and then restoring the haplotypes based on the inferred IBD, the method scales to large pedigrees and, more importantly, handles families with missing members. Compared with existing methods, it demonstrates much higher power to recover haplotype information, especially in families with many untyped individuals.
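To make concrete why haplotypes are not directly available from genotype data, the following minimal sketch enumerates every haplotype pair consistent with one individual's unphased genotype. It illustrates the phase ambiguity only and is not the method proposed here; the function name `compatible_haplotype_pairs` and the encoding of genotypes as per-SNP minor-allele counts (0, 1, 2) are assumptions made for this example.

```python
from itertools import product


def compatible_haplotype_pairs(genotype):
    """Enumerate unordered haplotype pairs consistent with an unphased genotype.

    genotype: sequence of per-SNP minor-allele counts (0, 1, or 2).
    This encoding is an assumption of the sketch, not taken from the paper.
    """
    # Only heterozygous sites (count == 1) have ambiguous phase.
    het_sites = [i for i, g in enumerate(genotype) if g == 1]
    pairs = set()
    for phase in product((0, 1), repeat=len(het_sites)):
        # Homozygous sites are fixed: count 0 -> allele 0, count 2 -> allele 1.
        h1 = [g // 2 for g in genotype]
        h2 = list(h1)
        # At each heterozygous site the two haplotypes carry opposite alleles.
        for site, allele in zip(het_sites, phase):
            h1[site] = allele
            h2[site] = 1 - allele
        # Sort within the pair so (h1, h2) and (h2, h1) count once.
        pairs.add(tuple(sorted((tuple(h1), tuple(h2)))))
    return sorted(pairs)


if __name__ == "__main__":
    for pair in compatible_haplotype_pairs([1, 1, 2]):
        print(pair)
```

With k heterozygous sites (k ≥ 1) there are 2^(k-1) compatible pairs, so genotypes alone quickly stop determining haplotypes; inheritance information from relatives, such as the IBD structure described above, is what narrows the candidates in a pedigree setting.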
