A dynamic pipeline for RNA sequencing on multicore processors

We present a concurrent algorithm for mapping short and long RNA sequences on multicore processors. Our solution processes the data, initially stored on disk, in batches of reads that are passed between the consecutive stages of a pipeline. A major operational reorganization of the original static pipeline, combined with a complete reimplementation based on POSIX threads, decouples threads from stages and task types, so that any thread can execute any type of pending task, resulting in a dynamic pipeline. Experiments on a multicore platform reveal that this reorganization yields significantly higher performance, especially on architectures equipped with a small to moderate number of cores. As an additional contribution, our experiments also reveal that using 16-nucleotide (nt) seeds during one of the stages of the pipeline, instead of the 15-nt seeds originally proposed, yields a remarkable reduction in the execution time of the global alignment process while maintaining the sensitivity of the algorithm.
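The idea of a dynamic pipeline, in which threads are not bound to a stage and instead pick up any pending task, can be illustrated with a minimal sketch. It is written in Python with the standard `threading` and `queue` modules rather than POSIX threads, and it is not the implementation described here: the two stage functions, the name `run_dynamic_pipeline`, and the `num_threads` parameter are hypothetical placeholders chosen only for illustration.

```python
import queue
import threading


# Hypothetical stages: placeholders, not the stages of the actual pipeline.
def stage_normalize(batch):
    """Stage 0: normalize each read in the batch to upper case."""
    return [read.upper() for read in batch]


def stage_measure(batch):
    """Stage 1: pair each read with its length."""
    return [(read, len(read)) for read in batch]


STAGES = [stage_normalize, stage_measure]


def run_dynamic_pipeline(batches, num_threads=4):
    """Process batches through STAGES with threads that are not tied to a stage."""
    tasks = queue.Queue()          # single shared pool of pending tasks
    results = {}
    results_lock = threading.Lock()

    # Every batch enters the pipeline as a stage-0 task.
    for batch_id, batch in enumerate(batches):
        tasks.put((0, batch_id, batch))

    def worker():
        while True:
            item = tasks.get()
            if item is None:       # sentinel: no more work
                tasks.task_done()
                return
            stage, batch_id, data = item
            output = STAGES[stage](data)
            if stage + 1 < len(STAGES):
                # Enqueue the follow-up task before marking this one done,
                # so the queue never looks finished while work remains.
                tasks.put((stage + 1, batch_id, output))
            else:
                with results_lock:
                    results[batch_id] = output
            tasks.task_done()

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    tasks.join()                   # wait until all tasks of all stages are done
    for _ in threads:
        tasks.put(None)            # one sentinel per worker
    for t in threads:
        t.join()
    return [results[i] for i in range(len(batches))]


if __name__ == "__main__":
    print(run_dynamic_pipeline([["acgu", "gg"], ["u"]], num_threads=3))
```

In a static pipeline each thread would serve a single stage and could sit idle while another stage is overloaded. In this sketch every worker pulls from the same queue, so a thread that finishes a task of one stage may continue with a task of any other stage.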