deSPI: efficient classification of metagenomic reads with lightweight de Bruijn graph-based reference indexing

Summary: In metagenomic studies, fast and effective tools are in high demand for the taxonomic classification of up to billions of reads. Herein, we propose deSPI, a novel read classification method that classifies reads by recognizing and analyzing the matches between reads and reference sequences, using lightweight de Bruijn graph-based reference indexing. deSPI offers faster speed with a relatively small memory footprint, while achieving higher or similar sensitivity and accuracy.

Availability: The C++ source code of deSPI is available at https://github.com/hitbc/deSPI

Contact: ydwang@hit.edu.cn

Supplementary information: Supplementary data are available at Bioinformatics online.
