Graph-based exploitation of gene ontology using GOxploreR for scrutinizing biological significance

Gene ontology (GO) is a prominent knowledge base frequently used to interpret analyses of genes or gene sets arising from biological, medical and clinical problems. Unfortunately, interpreting such results is challenging because of the large number of GO-terms, their hierarchical and connected organization as directed acyclic graphs (DAGs), and the lack of tools that exploit this structural information explicitly. For this reason, we developed the R package GOxploreR. The main features of GOxploreR are (I) easy and direct access to structural features of GO, (II) structure-based ranking of GO-terms, (III) mapping to reduced GO-DAGs, including visualization capabilities, and (IV) prioritization of GO-terms. The underlying idea of GOxploreR is to take a graph-theoretical perspective of GO, as manifested by its DAG structure and the hierarchy levels it contains, for accumulating semantic information. All of these features enhance the use of the structural information in GO and complement existing analysis tools. Overall, GOxploreR provides exploratory as well as confirmatory tools that complement any analysis resulting in a list of GO-terms, e.g., from differentially expressed genes or gene sets, GWAS or biomarkers. Our R package GOxploreR is freely available from CRAN.
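
To make the notion of hierarchy levels in a DAG concrete, the following minimal Python sketch assigns a level to every term of a small, made-up graph and then ranks the terms by depth. It is an illustration only and does not use or reproduce the GOxploreR API. The term names, the function name `hierarchy_levels`, and the definition of a level as the length of the longest path from a parentless term are all assumptions made for this example.

```python
from collections import defaultdict, deque


def hierarchy_levels(edges):
    """Return {term: level} for a DAG given as (parent, child) pairs.

    Assumption for this sketch: the level of a term is the length of the
    longest path leading to it from a term without parents (level 0).
    """
    children = defaultdict(list)
    indegree = defaultdict(int)
    nodes = set()
    for parent, child in edges:
        children[parent].append(child)
        indegree[child] += 1
        nodes.update((parent, child))

    level = {node: 0 for node in nodes}
    # Kahn-style traversal: a term is processed only after all its parents.
    queue = deque(node for node in nodes if indegree[node] == 0)
    while queue:
        node = queue.popleft()
        for child in children[node]:
            level[child] = max(level[child], level[node] + 1)
            indegree[child] -= 1
            if indegree[child] == 0:
                queue.append(child)
    return level


# Hypothetical toy DAG; the names are placeholders, not real GO identifiers.
toy_edges = [
    ("root", "A"),
    ("root", "B"),
    ("A", "C"),
    ("B", "C"),
    ("C", "D"),
    ("root", "D"),
]

levels = hierarchy_levels(toy_edges)
# Rank terms from deepest (most specific) to shallowest, ties by name.
ranking = sorted(levels, key=lambda term: (-levels[term], term))
```

In this toy graph the term `D` is reachable both directly from `root` and via `C`; under the longest-path assumption it is placed on the deeper level, which is why a term with several parents needs all of them to be processed first.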
