HPG pore: an efficient and scalable framework for nanopore sequencing data

Background: Nanopore technologies are expected to spread because they are portable and can sequence long fragments of DNA molecules without prior amplification. The first nanopore sequencer available, the MinION™ from Oxford Nanopore Technologies, is a portable, USB-connected device that allows real-time DNA analysis. Other new instruments are expected to be released soon and promise to outperform current short-read technologies in throughput. Despite the flood of data expected from this technology, the data analysis solutions currently available are designed only for small projects and do not scale.

Results: Here we present HPG Pore, a toolkit for exploring and analysing nanopore sequencing data. HPG Pore can run both on individual computers and on the Hadoop distributed computing framework, which allows easy scale-up to manage the large amounts of data expected from extensive use of nanopore technologies.

Conclusions: HPG Pore allows for virtually unlimited scalability in sequencing data volume, so it can continue to manage this data in near-future scenarios. HPG Pore is available on GitHub at http://github.com/opencb/hpg-pore.
