Tedna: a transposable element de novo assembler

MOTIVATION: Recent technological advances are allowing many laboratories to sequence their research organisms. Available de novo assemblers leave repetitive portions of the genome poorly assembled. Some genomes contain high proportions of transposable elements, and transposable elements appear to be a major force behind diversity and adaptation. Few de novo assemblers for transposable elements exist, and most have been designed either for small genomes or for 454 reads.

RESULTS: In this article, we present a new transposable element de novo assembler, Tedna, which assembles a set of transposable elements directly from the reads. Tedna uses Illumina paired-end reads, the most widely used sequencing technology for de novo assembly, and produces full-length transposable elements.

AVAILABILITY AND IMPLEMENTATION: Tedna is available at http://urgi.versailles.inra.fr/Tools/Tedna, under the GPLv3 license. It is written in C++11 and requires only the Sparsehash package, freely available under the New BSD License. Tedna can be run on standard computers with limited RAM, although it can also use more memory to obtain better results. Most of the code is parallelized and thus ready for large infrastructures.