Genomic and transcriptomic resources for assassin flies including the complete genome sequence of Proctacanthus coquilletti (Insecta: Diptera: Asilidae) and 16 representative transcriptomes

A high-quality draft genome for Proctacanthus coquilletti (Insecta: Diptera: Asilidae) is presented along with transcriptomes for 16 Diptera species from five families: Asilidae, Apioceridae, Bombyliidae, Mydidae, and Tabanidae. Genome sequencing reveals that P. coquilletti has a genome size of approximately 210 Mbp, remarkably low heterozygosity (0.47%), and few repeats (15%). These characteristics helped produce a highly contiguous assembly (N50 = 862 kbp), which is notable given that only a single 2 × 250 bp PCR-free Illumina library was sequenced. A phylogenomic hypothesis based on thousands of putative orthologs across the 16 transcriptomes is presented. The inferred relationships support Apioceridae + Mydidae as the sister group to Asilidae. A time-calibrated phylogeny with seven fossil calibration points is also presented; it suggests older ages for the split among Apioceridae, Asilidae, and Mydidae (158 mya) and for the split between Apioceridae and Mydidae (135 mya) than those proposed in the AToL FlyTree project. Future studies can use the resources presented here for large-scale phylogenomic and evolutionary analyses of assassin fly phylogeny, life histories, or venom. The bioinformatics tools and workflow presented here will be useful to others wishing to generate de novo genomic resources in species-rich taxa without a closely related reference genome.
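
The contiguity figure quoted above (N50 = 862 kbp) uses the standard N50 statistic: the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the total assembly length. The following is a minimal sketch of that definition only, not the authors' workflow; the function name `n50` and the toy contig lengths are illustrative assumptions.

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths (in bp).

    Illustrative helper (hypothetical name, not from the paper):
    sort contigs from longest to shortest and return the length of
    the contig at which the running total first reaches half of the
    total assembly length.
    """
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("no contig lengths given")
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length


# Toy data, not real assembly output: total = 1500 bp, half = 750 bp.
# 500 + 400 = 900 >= 750, so the N50 is 400.
toy_contigs = [100, 200, 300, 400, 500]
print(n50(toy_contigs))  # prints 400
```

Because N50 depends only on the distribution of contig lengths, it measures contiguity and says nothing about correctness or completeness of the assembly.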
