SoyTEdb: a comprehensive database of transposable elements in the soybean genome

Background: Transposable elements are the most abundant components of all characterized genomes of higher eukaryotes. It has been documented that these elements not only contribute to the shaping and reshaping of their host genomes, but also play significant roles in regulating gene expression, altering gene function, and creating new genes. Thus, complete identification of transposable elements in sequenced genomes and construction of comprehensive transposable element databases are essential for accurate annotation of genes and other genomic components, for investigation of potential functional interactions between transposable elements and genes, and for the study of genome evolution. The recent availability of the soybean genome sequence has provided an unprecedented opportunity for the discovery and the structural and functional characterization of transposable elements in this economically important legume crop.

Description: Using a combination of structure-based and homology-based approaches, a total of 32,552 retrotransposons (Class I) and 6,029 DNA transposons (Class II) with clear boundaries and insertion sites were structurally annotated and categorized, and a soybean transposable element database, SoyTEdb, was established. These transposable elements have been anchored in and integrated with the soybean physical map and genetic map, and can be browsed and visualized at any scale along the 20 soybean chromosomes, along with predicted genes and other sequence annotations. BLAST search and other infrastructure tools were implemented to facilitate annotation of transposable elements or fragments from soybean and other related legume species. The majority (>95%) of these elements (particularly a few hundred low-copy-number families) are first described in this study.

Conclusion: SoyTEdb provides resources and information related to transposable elements in the soybean genome, representing the most comprehensive and the largest manually curated transposable element database for any individual plant genome completely sequenced to date. Transposable elements previously identified in legumes, the third largest family of flowering plants, are relatively scarce. Thus, this database will facilitate structural, evolutionary, functional, and epigenetic analyses of transposable elements in soybean and other legume species.
