A brief survey of tools for genomic regions enrichment analysis

Functional enrichment analysis, or pathway enrichment analysis (PEA), is a bioinformatics technique that identifies the biological pathways over-represented in a list of genes relative to what would be expected by chance. These pathways are drawn from annotated bioinformatics databases such as the Gene Ontology or KEGG, and over-representation is assessed with statistical techniques such as Fisher’s exact test. Most PEA tools require a list of genes as input. A few tools, however, read lists of genomic regions rather than lists of genes, and first associate these chromosome regions with their corresponding genes. These tools perform a procedure called genomic regions enrichment analysis, which can be useful for detecting the biological pathways related to a set of chromosome regions. In this brief survey, we analyze six tools for genomic regions enrichment analysis (BEHST, g:Profiler g:GOSt, GREAT, LOLA, Poly-Enrich, and ReactomePA), outlining and comparing their main features. Our comparison indicates that the inclusion of data on regulatory elements, such as ChIP-seq data, is common among these tools and could therefore improve enrichment analysis results.
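The two steps described above (associating regions with genes, then testing for over-representation) can be illustrated with a minimal sketch. It is not the algorithm of any of the six surveyed tools: the gene coordinates, the pathway, the regions, and the fixed-window association rule (`max_dist`) are all hypothetical, and the one-sided Fisher's exact test is computed directly as a hypergeometric tail.

```python
from math import comb


def genes_for_regions(regions, gene_coords, max_dist=1000):
    """Return genes whose (hypothetical) TSS lies within max_dist of any region.

    This window rule is a simplifying assumption, not a surveyed tool's method.
    """
    hits = set()
    for chrom, start, end in regions:
        for gene, (g_chrom, tss) in gene_coords.items():
            if g_chrom == chrom and start - max_dist <= tss <= end + max_dist:
                hits.add(gene)
    return hits


def fisher_right_tail(k, n, K, N):
    """One-sided Fisher's exact test: P(X >= k) for X ~ Hypergeometric(N, K, n).

    N: genes in the background, K: background genes in the pathway,
    n: genes in the input list, k: input genes in the pathway.
    """
    upper = min(n, K)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / comb(N, n)


# Toy background of 10 genes (names and coordinates are made up).
gene_coords = {
    "G1": ("chr1", 1000), "G2": ("chr1", 5000), "G3": ("chr1", 9000),
    "G4": ("chr2", 2000), "G5": ("chr2", 8000), "G6": ("chr2", 20000),
    "G7": ("chr3", 1000), "G8": ("chr3", 5000), "G9": ("chr3", 9000),
    "G10": ("chr3", 15000),
}
pathway = {"G1", "G2", "G3"}  # hypothetical pathway annotation

# Input genomic regions as (chromosome, start, end).
regions = [("chr1", 900, 1100), ("chr1", 4800, 5200), ("chr2", 1500, 2500)]

# Step 1: associate the regions with genes.
hits = genes_for_regions(regions, gene_coords)

# Step 2: test whether the pathway is over-represented among those genes.
p_value = fisher_right_tail(
    k=len(hits & pathway), n=len(hits), K=len(pathway), N=len(gene_coords)
)
print(sorted(hits), p_value)
```

In practice the association rule is where region-based tools differ most, so the window in `genes_for_regions` is the part to swap out when approximating a particular tool.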
