The Dfam community resource of transposable element families, sequence models, and genome annotations

Dfam is an open-access database of repetitive DNA families, sequence models, and genome annotations. The 3.0–3.3 releases of Dfam (https://dfam.org) represent an evolution from a proof-of-principle collection of transposable element families in model organisms into a community resource covering a broad range of species and both curated and uncurated datasets. In addition, releases since Dfam 3.0 provide auxiliary consensus sequence models, transposable element protein alignments, and a formalized classification system to support the growing diversity of organisms represented in the resource. The latest release includes 266,740 new de novo generated transposable element families from 336 species contributed by the EBI. This expansion demonstrates the utility of many of Dfam’s new features and provides insight into the long-term challenges ahead for improving de novo generated transposable element datasets.
