ATGC transcriptomics: a web-based application to integrate, explore and analyze de novo transcriptomic data

Background: In recent years, applications based on massively parallel RNA sequencing (RNA-seq) have become valuable approaches for studying non-model species, e.g., those without a fully sequenced genome. RNA-seq is a useful tool for detecting novel transcripts and genetic variations and for evaluating differential gene expression through digital measurements. The large and complex datasets produced by functional genomics experiments are challenging to process, manage, and analyze. This problem is especially significant for small research groups working with non-model species.

Results: We developed a web-based application, ATGC transcriptomics, with a flexible and adaptable interface that lets users work with the results of next-generation sequencing (NGS) transcriptomic analyses through an ontology-driven database. The application simplifies data exploration, visualization, and integration, making the results easier to interpret.

Conclusions: ATGC transcriptomics gives users without computing expertise and small research groups access to scalable storage and simple data integration, including database administration and management. The software is freely available under the terms of the GNU public license at http://atgcinta.sourceforge.net.
