g:Profiler: a web server for functional enrichment analysis and conversions of gene lists (2019 update)

Abstract

Biological data analysis often deals with lists of genes arising from various studies. The g:Profiler toolset is widely used for finding biological categories enriched in gene lists, converting between gene identifiers, and mapping genes to their orthologs. The mission of g:Profiler is to provide a reliable service based on up-to-date, high-quality data in a convenient manner across many evidence types, identifier spaces and organisms. g:Profiler relies on Ensembl as its primary data source and follows the quarterly Ensembl release cycle, updating the other data sources at the same time. The current update provides a better user experience through a modern responsive web interface, a standardised API and libraries. The results are delivered through an interactive and configurable web design, and can be downloaded as publication-ready visualisations or delimited text files. In the current update we have extended support to 467 species and strains, including vertebrates, plants, fungi, insects and parasites. By supporting user-uploaded custom GMT files, g:Profiler is now capable of analysing data from any organism. All past releases are maintained for reproducibility and transparency. The 2019 update introduces an extensive technical rewrite that makes the services faster and more flexible. g:Profiler is freely available at https://biit.cs.ut.ee/gprofiler.
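To make the idea of enrichment against a custom GMT file concrete, the following is a minimal sketch and not g:Profiler's own implementation. It assumes the common tab-delimited GMT layout (set name, description, then member genes) and a plain one-sided hypergeometric over-representation test without multiple testing correction. The function names `parse_gmt`, `hypergeom_pvalue` and `enrich`, and all gene identifiers, are hypothetical and chosen only for illustration.

```python
from math import comb


def parse_gmt(text):
    """Parse GMT text: each line is name<TAB>description<TAB>gene1<TAB>gene2..."""
    gene_sets = {}
    for line in text.splitlines():
        fields = line.split("\t")
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        name, _description, *genes = fields
        gene_sets[name] = {g for g in genes if g}
    return gene_sets


def hypergeom_pvalue(k, n, K, N):
    """P(X >= k) when n genes are drawn from N, of which K belong to the set."""
    upper = min(n, K)
    total = comb(N, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / total


def enrich(query, gene_sets, background):
    """Return (set name, overlap size, p-value) tuples sorted by p-value."""
    background = set(background)
    query = set(query) & background
    results = []
    for name, members in gene_sets.items():
        members = members & background
        overlap = len(query & members)
        if overlap == 0:
            continue  # no shared genes, nothing to report
        p = hypergeom_pvalue(overlap, len(query), len(members), len(background))
        results.append((name, overlap, p))
    return sorted(results, key=lambda r: r[2])


if __name__ == "__main__":
    # Hypothetical toy data: 10 background genes and two made-up gene sets.
    gmt_text = "SET_A\texample set A\tg0\tg1\tg2\nSET_B\texample set B\tg7\tg8\n"
    background = [f"g{i}" for i in range(10)]
    for row in enrich(["g0", "g1", "g2"], parse_gmt(gmt_text), background):
        print(row)
```

In practice the raw p-values would still need a multiple testing correction across all tested gene sets, which this sketch deliberately leaves out.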
