TieBrush: an efficient method for aggregating and summarizing mapped reads across large datasets

SUMMARY: Although the ability to programmatically summarize and visually inspect sequencing data is an integral part of genome analysis, currently available methods cannot handle large numbers of samples. In particular, visual comparison of the transcriptional landscapes of two sets of thousands of RNA-seq samples is limited by available computational resources, which can be overwhelmed by the sheer size of the data. In this work we present TieBrush, a software package designed to process very large sequencing datasets (RNA, whole-genome, exome, etc.) into a form that enables quick visual and computational inspection. TieBrush can also be used to aggregate data for downstream computational analysis, and is compatible with most software tools that take aligned reads as input.

AVAILABILITY: TieBrush is provided as a C++ package under the MIT License. Pre-compiled binaries, source code and example data are available on GitHub (https://github.com/alevar/tiebrush).

SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.