Comparisons and performance evaluations of RNA-seq alignment tools

Next-generation sequencing (NGS) has been widely used in biological and medical research. NGS facilitates the identification of mutations and differentially expressed genes that are causative of diseases. However, its high-throughput nature poses major challenges for the analysis, especially the alignment step. The alignment step is critical in NGS gene expression analysis because the correctness of the downstream steps depends heavily on it. We therefore evaluated the performance of four popular alignment algorithms, TopHat, STAR, MapSplice, and GSNAP, on simulated data generated by Flux-Simulator. Shorter and longer reference genomes and read lengths were evaluated in four scenarios. Three indices, namely time cost, alignment accuracy, and junction detection, were used to assess the performance of the algorithms. The results show that TopHat has the highest alignment accuracy and that STAR is the fastest, with slightly lower accuracy. MapSplice and GSNAP are not as stable as STAR and TopHat and may encounter more problems during alignment.
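As an illustration of how two of the three indices could be computed when the true origin of every simulated read is known, the following is a minimal sketch and not the evaluation code used in this study. The function names, the position tolerance, and the representation of reads and junctions as plain tuples are all assumptions made for the example; parsing of simulator and aligner output is left out.

```python
from typing import Dict, Set, Tuple

# Hypothetical representations, chosen for this sketch only.
Position = Tuple[str, int]        # (chromosome, leftmost mapped position)
Junction = Tuple[str, int, int]   # (chromosome, intron start, intron end)


def alignment_accuracy(truth: Dict[str, Position],
                       aligned: Dict[str, Position],
                       tolerance: int = 0) -> float:
    """Fraction of simulated reads placed at their true location.

    A read counts as correct when the aligner reports the same chromosome
    and a position within `tolerance` bases of the simulated one.
    Reads the aligner did not report at all count as incorrect.
    """
    if not truth:
        return 0.0
    correct = 0
    for read_id, (chrom, pos) in truth.items():
        hit = aligned.get(read_id)
        if hit is not None and hit[0] == chrom and abs(hit[1] - pos) <= tolerance:
            correct += 1
    return correct / len(truth)


def junction_scores(true_junctions: Set[Junction],
                    found_junctions: Set[Junction]) -> Tuple[float, float]:
    """Return (precision, recall) of reported splice junctions."""
    true_positives = len(true_junctions & found_junctions)
    precision = true_positives / len(found_junctions) if found_junctions else 0.0
    recall = true_positives / len(true_junctions) if true_junctions else 0.0
    return precision, recall
```

Time cost, the third index, would be measured separately, for example as the wall-clock time of each aligner run on the same hardware.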