dipSPAdes: Assembler for Highly Polymorphic Diploid Genomes

While the number of sequenced diploid genomes of interest have been steadily increasing in the last few years, assembly of highly polymorphic HP diploid genomes remains challenging. As a result, there is shortage of tools for assembling HP genomes from NGS data. The initial approaches to assembling HP genomes were proposed in the pre-NGS era and are not well suited for NGS projects. We present the first de Bruijn graph assembler dipSPAdes for HP genomes and demonstrate that it significantly improves on the state-of-the-art in the HP genome assembly.

[1]  Alla Lapidus,et al.  ExSPAnder: a universal repeat resolver for DNA fragment assembly , 2014, Bioinform..

[2]  Paramvir S. Dehal,et al.  Whole-Genome Shotgun Assembly and Analysis of the Genome of Fugu rubripes , 2002, Science.

[3]  Pavel A Pevzner,et al.  How to apply de Bruijn graphs to genome assembly. , 2011, Nature biotechnology.

[4]  Eleazar Eskin,et al.  Optimal algorithms for haplotype assembly from whole-genome sequence data , 2010, Bioinform..

[5]  Steven Salzberg,et al.  GAGE-B: an evaluation of genome assemblers for bacterial organisms , 2013, Bioinform..

[6]  M. Schatz,et al.  Algorithms Gage: a Critical Evaluation of Genome Assemblies and Assembly Material Supplemental , 2008 .

[7]  P. Pevzner,et al.  An Eulerian path approach to DNA fragment assembly , 2001, Proceedings of the National Academy of Sciences of the United States of America.

[8]  Cristel G. Thomas,et al.  Detecting heterozygosity in shotgun genome assemblies: Lessons from obligately outcrossing nematodes. , 2008, Genome research.

[9]  Vincent Lombard,et al.  Genome sequence of the model mushroom Schizophyllum commune , 2010, Nature Biotechnology.

[10]  Sergey I. Nikolenko,et al.  SPAdes: A New Genome Assembly Algorithm and Its Applications to Single-Cell Sequencing , 2012, J. Comput. Biol..

[11]  B. Berger,et al.  ARACHNE: a whole-genome shotgun assembler. , 2002, Genome research.

[12]  Nilgun Donmez,et al.  Hapsembler: An Assembler for Highly Polymorphic Genomes , 2011, RECOMB.

[13]  Guangrui Huang,et al.  HaploMerger: Reconstructing allelic relationships for polymorphic diploid genome assemblies , 2012, Genome research.

[14]  Jianer Chen,et al.  A model of higher accuracy for the individual haplotyping problem based on weighted SNP fragments and genotype with errors , 2008, ISMB.

[15]  Jill P Mesirov,et al.  Assembly of polymorphic genomes: algorithms and application to Ciona savignyi. , 2005, Genome research.

[16]  A. Halpern,et al.  An MCMC algorithm for haplotype assembly from whole-genome sequence data. , 2008, Genome research.

[17]  F. Blattner,et al.  Mauve: multiple alignment of conserved genomic sequence with rearrangements. , 2004, Genome research.

[18]  Paul Richardson,et al.  The Draft Genome of Ciona intestinalis: Insights into Chordate and Vertebrate Origins , 2002, Science.

[19]  Sorin Istrail,et al.  HapCompass: A Fast Cycle Basis Algorithm for Accurate Haplotype Assembly of Sequence Data , 2012, J. Comput. Biol..

[20]  Alexey A. Gurevich,et al.  QUAST: quality assessment tool for genome assemblies , 2013, Bioinform..

[21]  E. Birney,et al.  Velvet: algorithms for de novo short read assembly using de Bruijn graphs. , 2008, Genome research.