Genome Assembler for Repetitive Sequences

The article presents a new algorithm for assembling random DNA fragments. The algorithm uses an extended de Bruijn graph that stores read coverage information. It can reconstruct consecutive repetitive sequences longer than the reads and can process the large amounts of data produced by next-generation sequencers. Preliminary simulation results show the advantages of the method.
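
To make the idea of a de Bruijn graph annotated with read coverage concrete, here is a minimal sketch. It is not the article's algorithm, whose details are not given here. It assumes error-free reads on a single strand, and the function name `build_coverage_graph`, the toy reads, and the choice `k = 4` are illustrative only. Each k-mer in a read contributes one edge from its (k-1)-length prefix to its (k-1)-length suffix, and the edge counter records how many times reads cover that edge.

```python
from collections import Counter


def build_coverage_graph(reads, k):
    """Map each de Bruijn edge (prefix, suffix) to its read coverage.

    Hypothetical helper for illustration: nodes are (k-1)-mers, and each
    k-mer occurrence in a read adds 1 to the coverage of one edge.
    """
    edges = Counter()
    for read in reads:
        # Slide a window of length k over the read.
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            edges[(kmer[:-1], kmer[1:])] += 1
    return edges


# Toy input (assumed): overlapping reads from the repetitive string ACGTACGT...
reads = ["ACGTAC", "CGTACG", "GTACGT"]
graph = build_coverage_graph(reads, k=4)

for (src, dst), coverage in sorted(graph.items()):
    print(f"{src} -> {dst}  coverage={coverage}")
```

In a plain de Bruijn graph, a repeat longer than the reads collapses into a cycle with no record of how often it is traversed. Keeping per-edge coverage is one plausible way to retain a hint about that multiplicity, which is the kind of information the abstract says the extended graph stores.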
