Next generation shotgun sequencing and the challenges of de novo genome assembly

Sequencing the genomes of the many scientifically fascinating plants and animals found in South Africa is fast becoming viable, thanks to the rapid and sustained drop in the cost of next-generation sequencing over the last five years. However, processing and assembling the resulting sequence data is not trivial, and several factors need to be considered when planning a de novo assembly strategy. This paper reviews the advances and challenges in two of the most rapidly developing areas of the field: the sequencing technologies themselves, and the software used to assemble the sequence data they generate de novo into a genome.
