Haplotype block partitioning and tagSNP selection under the perfect phylogeny model

Single nucleotide polymorphisms (SNPs) are the most common form of polymorphism in the human genome. Analyses of genetic variation have revealed that individual genomes share common SNP haplotypes, and the pattern of these common variations forms a block-like structure along the human genome. In this work, we develop a new method based on the perfect phylogeny model to identify haplotype blocks from samples of individual genomes. We introduce a rigorous definition of the quality of a partitioning of haplotypes into blocks and devise a greedy algorithm for finding a proper partitioning in the cases of perfect and semi-perfect phylogeny. We show that the minimum number of tagSNPs in a haplotype block that satisfies perfect phylogeny can be obtained by a polynomial-time algorithm. Through simulations on haplotype data from human chromosome 21, we compare the performance of our algorithm with that of previously developed methods. The results demonstrate that our algorithm outperforms the conventional implementation of the Four Gamete Test approach, which is the only available method for haplotype block partitioning based on
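For readers unfamiliar with the Four Gamete Test baseline mentioned above, the following is a minimal sketch of one conventional way to use it for block partitioning. It is not the algorithm developed in this work. It assumes biallelic SNPs coded as '0'/'1', phased haplotypes given as equal-length strings, and a simple left-to-right greedy scan; the function names `passes_four_gamete_test` and `greedy_fgt_blocks` are hypothetical and chosen only for illustration. The sketch relies on the standard fact that a pair of binary SNP columns is compatible with a perfect phylogeny only if at most three of the four gametes 00, 01, 10, 11 occur.

```python
def passes_four_gamete_test(haplotypes, i, j):
    """Return True if SNP columns i and j show at most three distinct gametes."""
    gametes = {(h[i], h[j]) for h in haplotypes}
    return len(gametes) < 4


def greedy_fgt_blocks(haplotypes):
    """Partition SNP indices into consecutive blocks with a left-to-right scan.

    A SNP is added to the current block only if it passes the four gamete
    test against every SNP already in that block; otherwise a new block
    starts at that SNP. Returns a list of (first_index, last_index) pairs.
    """
    if not haplotypes or len(haplotypes[0]) == 0:
        return []
    n_snps = len(haplotypes[0])
    blocks = []
    start = 0
    for k in range(1, n_snps):
        compatible = all(
            passes_four_gamete_test(haplotypes, i, k) for i in range(start, k)
        )
        if not compatible:
            blocks.append((start, k - 1))  # close the current block
            start = k                      # SNP k opens the next block
    blocks.append((start, n_snps - 1))
    return blocks


# Toy data (invented for illustration): SNPs 0 and 2 show all four gametes,
# so the scan splits the region between SNP 1 and SNP 2.
toy_haplotypes = ["0000", "0011", "1100", "1111"]
print(greedy_fgt_blocks(toy_haplotypes))  # [(0, 1), (2, 3)]
```

The pairwise check costs time proportional to the number of haplotypes, so the scan as written is quadratic in block length in the worst case; this is adequate for a sketch but is not tuned for chromosome-scale data.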
