Meta-Alignment: Combining Sequence Aligners for Better Results

Analysing next-generation sequencing data often involves using a sequence aligner to map the sequenced reads against a reference. The output of this process is the basis of many downstream analyses, and its quality is thus critical. Many different alignment tools exist, each with a multitude of options, creating a vast number of possible ways to align sequences. Choosing the correct aligner and options for a specific dataset is complex, yet it can have a major impact on the quality of the data analysis. We propose a new approach in which we combine the output of multiple sequence aligners to create an improved sequence alignment file. Our approach can be used to increase either the sensitivity or the specificity of the alignment process. The software is freely available for non-commercial use at http://gnaty.phenosystems.com/.
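To illustrate how combining aligner outputs could trade sensitivity against specificity, the following is a minimal sketch and not the method implemented in the software described above. It assumes each aligner's output has already been reduced to a mapping from read name to a (chromosome, position) placement. The function name `combine_alignments` and the parameter `min_votes` are hypothetical and introduced only for this example.

```python
from collections import Counter


def combine_alignments(per_aligner, min_votes=1):
    """Combine read placements reported by several aligners.

    per_aligner: list of dicts, one per aligner, mapping
                 read name -> (chromosome, position).
    min_votes:   how many aligners must agree on a placement
                 before it is kept (hypothetical parameter).
    """
    combined = {}
    # Every read that at least one aligner managed to place.
    all_reads = set().union(*per_aligner)
    for read in sorted(all_reads):
        # Count how many aligners reported each placement for this read.
        votes = Counter(calls[read] for calls in per_aligner if read in calls)
        placement, count = votes.most_common(1)[0]
        if count >= min_votes:
            combined[read] = placement
    return combined


# Toy placements from two imaginary aligners.
aligner_a = {"r1": ("chr1", 100), "r2": ("chr2", 50)}
aligner_b = {"r1": ("chr1", 100), "r2": ("chr3", 7), "r3": ("chrX", 9)}

# min_votes=1 keeps any read placed by at least one aligner.
sensitive = combine_alignments([aligner_a, aligner_b], min_votes=1)
# min_votes=2 keeps only placements both aligners agree on.
specific = combine_alignments([aligner_a, aligner_b], min_votes=2)
```

With `min_votes=1` the result behaves like a union and favours sensitivity; raising `min_votes` to the number of aligners behaves like an intersection and favours specificity. A real implementation would also have to handle mapping qualities, paired reads, and differing gap placements, which this sketch ignores.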
