Portable nanopore analytics: are we there yet?

MOTIVATION: Oxford Nanopore Technologies (ONT) adds miniaturization and real-time operation to high-throughput sequencing. All available software for ONT data analytics runs on cloud/cluster infrastructure or personal computers. A linchpin of true portability, however, is software that works on mobile devices without depending on internet connections. The chipsets, memory and operating systems of smartphones and tablets differ from those of desktop computers, but software can be recompiled. We sought to understand how portable current ONT analysis methods are.

RESULTS: Several tools, from base-calling to genome assembly, were ported and benchmarked on an Android smartphone. Out of 23 programs, 11 succeeded. Recompilation failures included missing standard headers and unsupported instruction sets. Only DSK, BCALM2 and Kraken were able to process files up to 16 GB, with linearly scaling CPU times. However, peak CPU temperatures were high. In conclusion, the portability scenario is not favorable. Given the fast market growth, developers should pay attention to ARM chipsets and Android/iOS, and initiatives to implement mobile-specific libraries are warranted.

Marco Oliva | Kaden King | Grace Benson | Christina Boucher | Mattia C. F. Prosperi | Inanc Birol | F. Milicchio

[1] K. Bibby, et al. Evaluation of Oxford Nanopore MinION™ Sequencing for 16S rRNA Microbiome Characterization, 2017, bioRxiv.

[2] Richard Durbin, et al. Fast and accurate short read alignment with Burrows-Wheeler transform, 2009.

[3] Franco Milicchio, et al. Third-generation sequencing data analytics on mobile devices: cache oblivious and out-of-core approaches as a proof-of-concept, 2018, FNC/MobiSPC.

[4] Oliver G. Pybus, et al. Mobile real-time surveillance of Zika virus in Brazil, 2016, Genome Medicine.

[5] Carlos de Lannoy, et al. The long reads ahead: de novo genome assembly using the MinION, 2017, F1000Research.

[6] Pavel A. Pevzner, et al. Assembly of long error-prone reads using de Bruijn graphs, 2016, Proceedings of the National Academy of Sciences.

[7] Hugh E. Olsen, et al. The Oxford Nanopore MinION: delivery of nanopore sequencing to the genomics community, 2016, Genome Biology.

[8] Maria Cristina Carile, et al. Enamel peptides reveal the sex of the Late Antique ‘Lovers of Modena’, 2019, Scientific Reports.

[9] Michael A. Bender, et al. Mantis: A Fast, Small, and Exact Large-Scale Sequence-Search Index, 2018, Cell Systems.

[10] R. Durbin, et al. Efficient de novo assembly of large genomes using compressed data structures, 2012, Genome Research.

[11] Paul Medvedev, et al. Compacting de Bruijn graphs from sequencing data quickly and in low memory, 2016, Bioinform.

[12] Derrick E. Wood, et al. Kraken: ultrafast metagenomic sequence classification using exact alignments, 2014, Genome Biology.

[13] Michael Liem, et al. Rapid de novo assembly of the European eel genome from nanopore sequencing reads, 2017, Scientific Reports.

[14] Heng Li, et al. A statistical framework for SNP calling, mutation discovery, association mapping and population genetical parameter estimation from sequencing data, 2011, Bioinform.

[15] Dominique Lavenier, et al. GATB: Genome Assembly & Analysis Tool Box, 2014, Bioinform.

[16] Chao Xie, et al. Fast and sensitive protein alignment using DIAMOND, 2014, Nature Methods.

[17] Matei David, et al. Nanocall: an open source basecaller for Oxford Nanopore sequencing data, 2016, bioRxiv.

[18] Sri Parameswaran, et al. Featherweight long read alignment using partitioned reference indexes, 2018, Scientific Reports.

[19] Christopher Wilks, et al. Scaling read aligners to hundreds of threads on general-purpose processors, 2017, bioRxiv.

[20] Heng Li, et al. Minimap2: pairwise alignment for nucleotide sequences, 2017, Bioinform.

[21] Nan Li, et al. Comparison of the two major classes of assembly algorithms: overlap-layout-consensus and de-bruijn-graph, 2012, Briefings in Functional Genomics.

[22] Esko Ukkonen, et al. Accurate self-correction of errors in long reads using de Bruijn graphs, 2016, Bioinform.

[23] Knut Reinert, et al. Lambda: the local aligner for massive biological data, 2014, Bioinform.

[24] Steven L. Salzberg, et al. Fast gapped-read alignment with Bowtie 2, 2012, Nature Methods.

[25] J. McPherson, et al. Coming of age: ten years of next-generation sequencing technologies, 2016, Nature Reviews Genetics.

[26] S. Koren, et al. Canu: scalable and accurate long-read assembly via adaptive k-mer weighting and repeat separation, 2016, bioRxiv.

[27] Brent S. Pedersen, et al. Nanopore sequencing and assembly of a human genome with ultra-long reads, 2017, Nature Biotechnology.

[28] Ilan Shomorony, et al. HINGE: Long-Read Assembly Achieves Optimal Repeat Resolution, 2016, bioRxiv.

[29] Dominique Lavenier, et al. DSK: k-mer counting with very low memory usage, 2013, Bioinform.

[30] Franco Milicchio, et al. Efficient data structures for mobile de novo genome assembly by third-generation sequencing, 2017, FNC/MobiSPC.

[31] Niranjan Nagarajan, et al. Fast and accurate de novo genome assembly from long uncorrected reads, 2017, Genome Research.

[32] Douglas J. Botkin, et al. Nanopore DNA Sequencing and Genome Assembly on the International Space Station, 2016, bioRxiv.

[33] Robert J. Fischer, et al. Nanopore Sequencing as a Rapidly Deployable Ebola Outbreak Tool, 2016, Emerging Infectious Diseases.

[34] Thomas C. Conway, et al. Succinct data structures for assembling large genomes, 2010, Bioinform.

[35] Wai Yee Low, et al. Introduction to next generation sequencing technologies, 2017.

[36] Minh Duc Cao, et al. Chiron: translating nanopore raw signal directly into nucleotide sequence using deep learning, 2017, bioRxiv.

[37] James B. Brown, et al. BasecRAWller: Streaming Nanopore Basecalling Directly from Raw Signal, 2017, bioRxiv.

[38] Gonçalo R. Abecasis, et al. The Sequence Alignment/Map format and SAMtools, 2009, Bioinform.

[39] Thomas L. Madden, et al. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs, 1997, Nucleic Acids Research.