De novo co-assembly of bacterial genomes from multiple single cells

Recent progress in DNA amplification techniques, particularly multiple displacement amplification (MDA), has made it possible to sequence and assemble bacterial genomes from a single cell. However, the quality of single-cell genome assembly has not yet reached that of normal multicell genome assembly, owing to the coverage bias and errors introduced by MDA. Using a template of more than one cell for MDA, or combining separate MDA products, has been shown to improve genome assembly from a few single cells, but both approaches require identical single cells, which are difficult to provide. As a solution to this problem, we give an algorithm for de novo co-assembly of bacterial genomes from multiple single cells. Our method not only detects the outlier cells in a pool but also identifies and eliminates their genomic sequences from the final assembly. The co-assembly algorithm is based on the colored de Bruijn graph, which was recently proposed for de novo structural variation detection. Our results show that de novo co-assembly of bacterial genomes from multiple single cells outperforms single-cell assembly of each individual cell in all standard metrics. Moreover, co-assembly outperforms mixed assembly, in which the input datasets are simply concatenated. We implemented our algorithm in a software tool called HyDA, which is available from http://compbio.cs.wayne.edu/software/hyda.
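
To make the idea of a colored de Bruijn graph concrete, the following is a minimal Python sketch, not HyDA's implementation. It assumes each single-cell dataset is given as a list of read strings and assigns each dataset one color; every canonical k-mer then carries a per-color count vector. The function names (`canonical`, `colored_kmer_counts`, `color_exclusivity`) and the "fraction of private k-mers" score are hypothetical, chosen only to illustrate how colors could expose a dataset that shares little sequence with the rest of the pool.

```python
from collections import defaultdict

_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def canonical(kmer):
    """Return the lexicographically smaller of a k-mer and its reverse complement."""
    rc = "".join(_COMPLEMENT[b] for b in reversed(kmer))
    return min(kmer, rc)


def colored_kmer_counts(datasets, k):
    """Map each canonical k-mer to a count vector with one entry per dataset (color)."""
    counts = defaultdict(lambda: [0] * len(datasets))
    for color, reads in enumerate(datasets):
        for read in reads:
            for i in range(len(read) - k + 1):
                counts[canonical(read[i:i + k])][color] += 1
    return counts


def color_exclusivity(counts, num_colors):
    """For each color, the fraction of its distinct k-mers seen in no other color.

    Illustrative heuristic only (an assumption, not the published method):
    a value near 1.0 suggests the dataset shares little with the pool.
    """
    total = [0] * num_colors
    private = [0] * num_colors
    for vec in counts.values():
        present = [c for c in range(num_colors) if vec[c] > 0]
        for c in present:
            total[c] += 1
        if len(present) == 1:
            private[present[0]] += 1
    return [p / t if t else 0.0 for p, t in zip(private, total)]


# Toy input: two identical "cells" and one unrelated one.
datasets = [["ACGTAC"], ["ACGTAC"], ["GGGGCC"]]
counts = colored_kmer_counts(datasets, k=4)
print(color_exclusivity(counts, len(datasets)))  # [0.0, 0.0, 1.0]
```

Keeping the counts separate per color, rather than summing them as a mixed assembly of concatenated inputs would, is what allows a downstream step to treat the datasets differently, for example by ignoring the k-mers of a color flagged as an outlier.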

[1]  F. Dean,et al.  Rapid amplification of plasmid and phage DNA using Phi 29 DNA polymerase and multiply-primed rolling circle amplification. , 2001, Genome research.

[2]  R. Fleischmann,et al.  Whole-genome random sequencing and assembly of Haemophilus influenzae Rd. , 1995, Science.

[3]  Marcel J. T. Reinders,et al.  De novo detection of copy number variation by co-assembly , 2012, Bioinform..

[4]  Eugene W. Myers,et al.  The fragment assembly string graph , 2005, ECCB/JBI.

[5]  J. Troge,et al.  Tumour evolution inferred by single-cell sequencing , 2011, Nature.

[6]  Method of the Year 2013 , 2013, Nature Methods.

[7]  B. Berger,et al.  ARACHNE: a whole-genome shotgun assembler. , 2002, Genome research.

[8]  Tanja Woyke,et al.  Genomic sequencing of single microbial cells from environmental samples. , 2008, Current opinion in microbiology.

[9]  Nuno A. Fonseca,et al.  Assemblathon 1: a competitive assessment of de novo short read assembly methods. , 2011, Genome research.

[10]  N. Loman,et al.  High-throughput bacterial genome sequencing: an embarrassment of choice, a world of opportunity , 2012, Nature Reviews Microbiology.

[11]  Haixu Tang,et al.  De novo repeat classification and fragment assembly , 2004, RECOMB.

[12]  Hamidreza Chitsaz,et al.  Distilled single-cell genome sequencing and de novo assembly for sparse microbial communities , 2013, Bioinform..

[13]  Paul Pritchard On Computing the Subset Graph of a Collection of Sets , 1999, J. Algorithms.

[14]  Sergey I. Nikolenko,et al.  SPAdes: A New Genome Assembly Algorithm and Its Applications to Single-Cell Sequencing , 2012, J. Comput. Biol..

[15]  R. Lasken,et al.  Genomic DNA Amplification from a Single Bacterium , 2005, Applied and Environmental Microbiology.

[16]  S. Kingsmore,et al.  Comprehensive human genome amplification using multiple displacement amplification , 2002, Proceedings of the National Academy of Sciences of the United States of America.

[17]  M. Schatz,et al.  Algorithms Gage: a Critical Evaluation of Genome Assemblies and Assembly Material Supplemental , 2008 .

[18]  P. Pevzner,et al.  An Eulerian path approach to DNA fragment assembly , 2001, Proceedings of the National Academy of Sciences of the United States of America.

[19]  Siu-Ming Yiu,et al.  IDBA-UD: a de novo assembler for single-cell and metagenomic sequencing data with highly uneven depth , 2012, Bioinform..

[20]  X. Xie,et al.  Genome-Wide Detection of Single-Nucleotide and Copy-Number Variations of a Single Human Cell , 2012, Science.

[21]  Hamidreza Chitsaz,et al.  Single-cell genome and metatranscriptome sequencing reveal metabolic interactions of an alkane-degrading methanogenic community , 2013, The ISME Journal.

[22]  Mark J. P. Chaisson,et al.  Short read fragment assembly of bacterial genomes. , 2008, Genome research.

[23]  C. Nusbaum,et al.  ALLPATHS: de novo assembly of whole-genome shotgun microreads. , 2008, Genome research.

[24]  Roger S Lasken,et al.  Whole genome amplification: abundant supplies of DNA from precious samples or clinical specimens. , 2003, Trends in biotechnology.

[25]  William A. Walters,et al.  Ultra-high-throughput microbial community analysis on the Illumina HiSeq and MiSeq platforms , 2012, The ISME Journal.

[26]  Daniel Pinkel,et al.  Whole genome analysis of genetic alterations in small DNA samples using hyperbranched strand displacement amplification and array-CGH. , 2003, Genome research.

[27]  Siu-Ming Yiu,et al.  IDBA - A Practical Iterative de Bruijn Graph De Novo Assembler , 2010, RECOMB.

[28]  Steven J. M. Jones,et al.  Abyss: a Parallel Assembler for Short Read Sequence Data Material Supplemental Open Access , 2022 .

[29]  Roger S Lasken,et al.  Single-cell genomic sequencing using Multiple Displacement Amplification. , 2007, Current opinion in microbiology.

[30]  Michael Roberts,et al.  Reducing storage requirements for biological sequence comparison , 2004, Bioinform..

[31]  Mihai Pop,et al.  Genome assembly reborn: recent computational challenges , 2009, Briefings Bioinform..

[32]  E. Birney,et al.  Velvet: algorithms for de novo short read assembly using de Bruijn graphs. , 2008, Genome research.

[33]  Mark J. P. Chaisson,et al.  De novo fragment assembly with short mate-paired reads: Does the read length matter? , 2009, Genome research.

[34]  P. Pevzner,et al.  Efficient de novo assembly of single-cell bacterial genomes from short-read data sets , 2011, Nature Biotechnology.

[35]  S. Linnarsson,et al.  Characterization of the single-cell transcriptional landscape by highly multiplex RNA-seq. , 2011, Genome research.

[36]  G. McVean,et al.  De novo assembly and genotyping of variants using colored de Bruijn graphs , 2011, Nature Genetics.