Lacking alignments? The next-generation sequencing mapper segemehl revisited

MOTIVATION Next-generation sequencing has become an important tool in molecular biology. Various protocols to investigate genomic, transcriptomic and epigenomic features across virtually all species and tissues have been devised. For most of these experiments, one of the first crucial steps of bioinformatic analysis is the mapping of reads to reference genomes. RESULTS Here, we present thorough benchmarks of our read aligner segemehl in comparison with other state-of-the-art methods. Furthermore, we introduce the tool lack to rescue unmapped RNA-seq reads which works in conjunction with segemehl and many other frequently used split-read aligners. AVAILABILITY lack is distributed together with segemehl and freely available at www.bioinf.uni-leipzig.de/Software/segemehl/.

[1]  Thomas R. Gingeras,et al.  STAR: ultrafast universal RNA-seq aligner , 2013, Bioinform..

[2]  Cole Trapnell,et al.  TopHat2: accurate alignment of transcriptomes in the presence of insertions, deletions and gene fusions , 2013, Genome Biology.

[3]  Knut Reinert,et al.  RazerS 3: Faster, fully sensitive read mapping , 2012, Bioinform..

[4]  Richard Durbin,et al.  Fast and accurate long-read alignment with Burrows–Wheeler transform , 2010, Bioinform..

[5]  Cole Trapnell,et al.  Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation. , 2010, Nature biotechnology.

[6]  Felix Krueger,et al.  Bismark: a flexible aligner and methylation caller for Bisulfite-Seq applications , 2011, Bioinform..

[7]  Nuno A. Fonseca,et al.  Tools for mapping high-throughput sequencing data , 2012, Bioinform..

[8]  R. Durbin,et al.  Sequence analysis Fast and accurate short read alignment with Burrows – Wheeler transform , 2009 .

[9]  Philip Hugenholtz,et al.  Shining a Light on Dark Sequencing: Characterising Errors in Ion Torrent PGM Data , 2013, PLoS Comput. Biol..

[10]  Roderic Guigó,et al.  The GEM mapper: fast, accurate and versatile alignment by filtration , 2012, Nature Methods.

[11]  Knut Reinert,et al.  A novel and well-defined benchmarking method for second generation read mapping , 2011, BMC Bioinformatics.

[12]  Oliver Hofmann,et al.  ASTD: The Alternative Splicing and Transcript Diversity database. , 2009, Genomics.

[13]  Peter F. Stadler,et al.  Fast and sensitive mapping of bisulfite-treated sequencing data , 2012, Bioinform..

[14]  W. J. Kent,et al.  BLAT--the BLAST-like alignment tool. , 2002, Genome research.

[15]  Manuel Holtgrewe,et al.  Mason – A Read Simulator for Second Generation Sequencing Data , 2010 .

[16]  Andrea Tanzer,et al.  A multi-split mapping algorithm for circular RNA, splicing, trans-splicing and fusion detection , 2014, Genome Biology.

[17]  Heng Li Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM , 2013, 1303.3997.

[18]  Steven L Salzberg,et al.  Fast gapped-read alignment with Bowtie 2 , 2012, Nature Methods.

[19]  Peter F. Stadler,et al.  Fast Mapping of Short Sequences with Mismatches, Insertions and Deletions Using Index Structures , 2009, PLoS Comput. Biol..

[20]  Michael Q. Zhang,et al.  Updates to the RMAP short-read mapping software , 2009, Bioinform..

[21]  Gonçalo R. Abecasis,et al.  The Sequence Alignment/Map format and SAMtools , 2009, Bioinform..

[22]  Richard Durbin,et al.  Sequence analysis Fast and accurate short read alignment with Burrows – Wheeler transform , 2009 .

[23]  Pao-Yang Chen,et al.  BS Seeker: precise mapping for bisulfite sequencing , 2010, BMC Bioinformatics.

[24]  Siu-Ming Yiu,et al.  SOAPsplice: Genome-Wide ab initio Detection of Splice Junctions from RNA-Seq Data , 2011, Front. Gene..

[25]  Cole Trapnell,et al.  Ultrafast and memory-efficient alignment of short DNA sequences to the human genome , 2009, Genome Biology.