Arioc: GPU‐accelerated alignment of short bisulfite‐treated reads

Motivation: The alignment of bisulfite‐treated DNA sequences (BS‐seq reads) to a large genome involves a significant computational burden beyond that required to align non‐bisulfite‐treated reads. In the analysis of BS‐seq data, this can present an important performance bottleneck that can be mitigated by appropriate algorithmic and software‐engineering improvements. One strategy is to modify the read‐alignment algorithms by integrating the logic related to BS‐seq alignment, with the goal of making the software implementation amenable to optimizations that lead to higher speed and greater sensitivity than might otherwise be attainable.

Results: We evaluated this strategy using Arioc, a short‐read aligner that uses GPU (general‐purpose graphics processing unit) hardware to accelerate computationally expensive programming logic. We integrated the BS‐seq computational logic into both GPU and CPU code throughout the Arioc implementation. We then carried out a read‐by‐read comparison of Arioc's reported alignments with the alignments reported by well‐known CPU‐based BS‐seq read aligners. With simulated reads, Arioc's accuracy is equal to or better than that of the other read aligners we evaluated. With human sequencing reads, Arioc's throughput is at least 10 times higher than that of existing BS‐seq aligners across a wide range of sensitivity settings.

Availability and implementation: The Arioc software is available for download at https://github.com/RWilton/Arioc. It is released under a BSD open‐source license.

Supplementary information: Supplementary data are available at Bioinformatics online.
