RITAN: rapid integration of term annotation and network resources

Background: Identifying the biologic functions of groups of genes found in high-throughput studies currently requires considerable time, bioinformatics experience, or both. This is due in part to each resource being housed in a separate database, which requires users to know that each resource exists and to integrate across them. Integrating across resources and merging the result with the data under study is time-consuming, is often repeated for each study, and is an increasingly common bioinformatics task.

Methods: We developed an open-source R software package that assists researchers in annotating their gene sets with functions, pathways, and the interconnectivity of the genes across a diversity of network resources.

Results: We present Rapid Integration of Term Annotation and Network resources (RITAN), which rapidly and comprehensively annotates a list of genes using functional term and pathway resources, and characterizes the relationships among those genes using multiple network biology resources. To comply with data redistribution policies, RITAN currently provides rapid access to 16 term annotation resources spanning gene ontology, biologic pathways, and immunologic modules, and to nine network biology resources, with support for user-supplied resources; we provide recommendations for additional resources and scripts to facilitate their addition to RITAN. Having the resources together in the same system allows users to derive novel combinations. RITAN has a growing set of tools to explore the relationships within the resources themselves. These tools allow users to merge resources such that the merged annotations have minimal overlap with one another. Because we index both functional annotations and network interactions, users can expand small groups of genes using links from biologic networks (either by adding all neighboring genes or by identifying genes that efficiently connect the input genes) and then apply term enrichment to identify functions. That is, users can start from a core set of genes, identify interacting genes from biologic networks, and then identify the functions to which the expanded list of genes contributes.

Conclusion: We believe RITAN fills the important niche of bridging the results of high-throughput experiments with the ever-growing corpus of functional annotations and network biology resources.

Availability: RITAN is available as an R package at github.com/MTZimmer/RITAN and BioConductor.org.
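The expand-then-enrich workflow described in the Results can be illustrated with a minimal sketch. It is not RITAN's implementation or API: it is written in plain Python rather than R, the function names `expand_with_neighbors` and `hypergeom_enrichment_p` are hypothetical, the gene identifiers, edges, and term sets are made-up placeholders, and a one-sided hypergeometric test is assumed as the enrichment statistic. Only the "add all neighboring genes" expansion is shown.

```python
from math import comb


def expand_with_neighbors(seed_genes, edges):
    """Return the seed genes plus every gene directly linked to a seed.

    `edges` is an iterable of undirected (gene_a, gene_b) pairs.
    """
    seeds = set(seed_genes)
    expanded = set(seeds)
    for a, b in edges:
        if a in seeds:
            expanded.add(b)
        if b in seeds:
            expanded.add(a)
    return expanded


def hypergeom_enrichment_p(query, term_genes, universe_size):
    """Upper-tail hypergeometric p-value, P(overlap >= observed).

    Assumes every gene in `query` and `term_genes` belongs to a
    background universe of `universe_size` genes.
    """
    query, term_genes = set(query), set(term_genes)
    k = len(query & term_genes)  # observed overlap
    n = len(query)               # size of the query list
    K = len(term_genes)          # size of the annotated term
    N = universe_size
    total = comb(N, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(n, K) + 1))
    return tail / total


# Toy data (placeholders, not real genes or annotations).
toy_edges = [("G1", "G2"), ("G2", "G3"), ("G3", "G4"), ("G5", "G6")]
toy_terms = {"term_a": {"G2", "G3", "G4"}, "term_b": {"G5", "G6"}}

expanded = expand_with_neighbors(["G1", "G3"], toy_edges)
for name, genes in toy_terms.items():
    p = hypergeom_enrichment_p(expanded, genes, universe_size=10)
    print(name, round(p, 4))
```

In this toy case the two seed genes expand to four genes, and only the term that overlaps the expanded list receives a small p-value. A real analysis would also correct for multiple testing across terms.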
