Polyploidy, the presence of more than two copies of each chromosome in the cells of an organism, is common in plants and animals, and finds important applications in the field of genetics. To understand the structure of each chromosome using Next Generation Sequencing (NGS), haplotype assembly is needed. We propose HapColor, a fragment partitioning approach based on a new conflict graph model. We introduce a graph coloring algorithm followed by a color merging method to accurately group short DNA reads into any number of partitions, depending on the ploidy level of the organism from which the sequencing data are derived. We compare HapColor with HapTree (a recently introduced polyploid haplotyping method), PGreedy (a polyploid haplotyping method that we develop based on Levy's well-known greedy algorithm), and RFP (a baseline random fragment partitioning method). Our analysis on triploid, tetraploid, hexaploid, and decaploid datasets demonstrates that HapColor substantially improves on the haplotype assembly accuracy of the other algorithms. The amount of improvement ranges from 25% to 90%, depending on the ploidy level.
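The pipeline described above (conflict graph, coloring, color merging) can be illustrated with a minimal sketch. It is not the HapColor implementation: the conflict rule (two fragments disagree at a shared SNP position), the degree-ordered greedy coloring, and the merging criterion (join the two color classes with the fewest conflict edges between them) are all assumptions made for illustration, as are the function names and the toy fragments.

```python
from itertools import combinations

def conflicts(f, g):
    """Assumed rule: fragments conflict if they disagree at any shared SNP position."""
    return any(f[p] != g[p] for p in f.keys() & g.keys())

def build_conflict_graph(frags):
    """Adjacency sets: one node per fragment, one edge per conflicting pair."""
    adj = {i: set() for i in range(len(frags))}
    for i, j in combinations(range(len(frags)), 2):
        if conflicts(frags[i], frags[j]):
            adj[i].add(j)
            adj[j].add(i)
    return adj

def greedy_coloring(adj):
    """Greedy coloring, visiting nodes by decreasing degree (a generic heuristic)."""
    color = {}
    for v in sorted(adj, key=lambda v: (-len(adj[v]), v)):
        used = {color[u] for u in adj[v] if u in color}
        c = 0
        while c in used:
            c += 1
        color[v] = c
    return color

def merge_colors(adj, color, k):
    """Merge color classes until at most k remain (hypothetical merging criterion)."""
    by_color = {}
    for v, c in color.items():
        by_color.setdefault(c, set()).add(v)
    groups = list(by_color.values())
    while len(groups) > k:
        def cost(pair):
            a, b = pair
            # Number of conflict edges running between the two classes.
            return sum(len(adj[v] & groups[b]) for v in groups[a])
        a, b = min(combinations(range(len(groups)), 2), key=cost)
        groups[a] |= groups[b]
        del groups[b]  # a < b, so index a is unaffected
    return groups

def partition_fragments(frags, ploidy):
    """Group fragment indices into at most `ploidy` partitions."""
    adj = build_conflict_graph(frags)
    return merge_colors(adj, greedy_coloring(adj), ploidy)

# Toy triploid example: each fragment maps SNP position -> observed allele.
frags = [
    {0: 0, 1: 0}, {1: 0, 2: 0},  # reads drawn from haplotype 000
    {0: 0, 1: 1}, {1: 1, 2: 1},  # reads drawn from haplotype 011
    {0: 1, 1: 1}, {1: 1, 2: 0},  # reads drawn from haplotype 110
]
groups = partition_fragments(frags, ploidy=3)
```

In this sketch the coloring step may already yield no more colors than the ploidy level, in which case the merging loop does nothing; merging only matters when conflicts caused by sequencing errors or sparse overlaps force extra colors.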
[1] Vineet Bafna, et al. HapCUT: an efficient and accurate algorithm for the haplotype assembly problem. ECCB, 2008.
[2] Joachim Selbig, et al. Haplotype inference from unphased SNP data in heterozygous polyploids based on SAT. BMC Genomics, 2008.
[3] Timothy B. Stockwell, et al. The Diploid Genome Sequence of an Individual Human. PLoS Biology, 2007.
[4] Daniel Brélaz, et al. New methods to color the vertices of a graph. CACM, 1979.
[5] Sorin Istrail, et al. Haplotype assembly in polyploid genomes and identical by descent shared tracts. Bioinform., 2013.
[6] Walter Klotz. Graph Coloring Algorithms. 2002.
[7] B. Browning, et al. Haplotype phasing: existing methods and new developments. Nature Reviews Genetics, 2011.
[8] Bonnie Berger, et al. HapTree: A Novel Bayesian Framework for Single Individual Polyplotyping Using NGS Data. PLoS Comput. Biol., 2014.