Strand Orientation Bias Detector (SOBDetector) to remove artifacts from sequencing data of formalin fixed samples

Summary: Due to its effectiveness and simplicity, formalin-fixed paraffin-embedded (FFPE) tissue processing is the most common approach for tissue specimen storage. It is well known, however, that formalin reacts with nitrogenous bases (especially cytosine) and causes DNA lesions, which are a source of sequencing artifacts. This makes next-generation sequencing (NGS) analysis of these specimens challenging. Here we present SOBDetector, a simple, user-friendly application that removes the majority of such artifacts from a list of detected variants by analyzing the orientation of the paired-end reads that span each variant's location. Although it was originally created for filtering human somatic mutations, it can be incorporated into variant calling pipelines for any species. Availability and Implementation: SOBDetector is implemented in Java 1.8 and is freely available at: www.github.com/mikdio/SOBDetector. Contact: mikdio@bioinformatics.dtu.dk, Zoltan.Szallasi@childrens.harvard.edu. Supplementary information: Supplementary data are available at Bioinformatics online.
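
Illustrative sketch: the idea of filtering by read-pair orientation can be shown with a small Java example. It rests on the assumption that a formalin-induced lesion sits on a single DNA strand, so read pairs supporting an artifact tend to share one orientation (F1R2 or F2R1), whereas a true variant is supported by both. The class name, the score formula, and the threshold below are hypothetical choices made for illustration; they are not taken from the SOBDetector source, and the tool's actual scoring and cutoffs may differ.

```java
// Illustrative sketch only; names, formula and threshold are assumptions,
// not SOBDetector's actual implementation.
final class StrandBiasSketch {

    /**
     * Orientation bias of the read pairs supporting the variant allele.
     * f1r2: supporting pairs where read 1 maps forward and read 2 reverse.
     * f2r1: supporting pairs where read 2 maps forward and read 1 reverse.
     * Returns a value in [-1, 1]; 0 means both orientations are balanced.
     */
    static double bias(int f1r2, int f2r1) {
        int total = f1r2 + f2r1;
        if (total == 0) {
            return 0.0; // no supporting pairs, so no evidence of bias
        }
        return (double) (f1r2 - f2r1) / total;
    }

    /** Flags a variant whose absolute bias reaches the (hypothetical) threshold. */
    static boolean looksLikeArtifact(int f1r2, int f2r1, double threshold) {
        return Math.abs(bias(f1r2, f2r1)) >= threshold;
    }

    public static void main(String[] args) {
        // Hypothetical counts: 18 F1R2 pairs and 2 F2R1 pairs support the variant.
        System.out.println("bias = " + bias(18, 2));
        System.out.println("artifact? " + looksLikeArtifact(18, 2, 0.7));
    }
}
```

In a real pipeline the two counts would be derived from the alignment file for each variant position, and the threshold would need to be tuned on data with known artifacts.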