Growing and cultivating the forest genomics database, TreeGenes

Abstract Forest trees are valued sources of pulp, timber and biofuels, and play a role in carbon sequestration, biodiversity maintenance and watershed stability. Examining the relationships among genetic, phenotypic and environmental factors for these species provides insight into areas of concern for breeders and researchers alike. The TreeGenes database is a web-based repository that holds data for 1790 tree species and serves over 1500 registered users. The database provides a curated archive for high-throughput genomics data, including reference genomes, transcriptomes, genetic maps and variant data. These resources are paired with extensive phenotypic information and environmental layers. TreeGenes recently migrated to Tripal, an integrated, open-source database schema and content management system. This migration enabled developments focused on data exchange, data transfer and improved analytical capacity, and it allows TreeGenes to communicate with the following partner databases: Hardwood Genomics Web, the Genome Database for Rosaceae and the Citrus Genome Database. Recent development in TreeGenes has focused on coordinating information for georeferenced accessions, including metadata acquisition and ontological frameworks, to improve integration across studies that combine genetic, phenotypic and environmental data. This focus was paired with the development of tools for comparative genomics and data visualization. By combining advanced data importers, relevant metadata standards and integrated analytical frameworks, TreeGenes provides a platform for researchers to submit, store and analyze forest tree data.
