Automatically visualise and analyse data on pathways using PathVisioRPC from any programming environment

Background: Biological pathways are descriptive diagrams of biological processes widely used for functional analysis of differentially expressed genes or proteins. Primary data analysis, such as quality control, normalisation, and statistical analysis, is often performed in scripting languages like R, Perl, and Python. Subsequent pathway analysis is usually performed using dedicated external applications. Workflows involving manual use of multiple environments are time-consuming and error-prone. Therefore, tools are needed that enable pathway analysis directly within the same scripting languages used for primary data analyses. Existing tools have limited capability in terms of available pathway content, pathway editing and visualisation options, and export file formats. Consequently, making the full-fledged pathway analysis tool PathVisio available from various scripting languages will benefit researchers.

Results: We developed PathVisioRPC, an XMLRPC interface for the pathway analysis software PathVisio. PathVisioRPC enables creating and editing biological pathways, visualising data on pathways, performing pathway statistics, and exporting results in several image formats in multiple programming environments.

We demonstrate PathVisioRPC functionalities using examples in Python. Subsequently, we use R to analyse a publicly available NCBI GEO gene expression dataset from a study of tumour-bearing mice treated with cyclophosphamide. The R scripts demonstrate how calls to existing R packages for data processing and calls to PathVisioRPC can directly work together. To further support R users, we have created RPathVisio, which simplifies the use of PathVisioRPC in this environment. We have also created a pathway module for the microarray data analysis portal ArrayAnalysis.org that calls the PathVisioRPC interface to perform pathway analysis. This module allows users to use PathVisio functionality online without having to download and install the software, and exemplifies how the PathVisioRPC interface can be used by data analysis pipelines for functional analysis of processed genomics data.

Conclusions: PathVisioRPC enables data visualisation and pathway analysis directly from within various analytical environments used for preliminary analyses. It supports the use of existing pathways from WikiPathways or pathways created using the RPC itself. It also enables automation of tasks performed using PathVisio, making it useful to PathVisio users performing repeated visualisation and analysis tasks. PathVisioRPC is freely available for academic and commercial use at http://projects.bigcat.unimaas.nl/pathvisiorpc.
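To illustrate the general XML-RPC call pattern that lets a scripting language drive a remote analysis tool, the following is a minimal Python sketch using only the standard library. It does not contact PathVisioRPC: a local stub server stands in for it, and the method name `Stub.createPathway`, its parameters, and its return value are hypothetical placeholders chosen for illustration, not the actual PathVisioRPC API.

```python
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer


def create_pathway(name, organism):
    """Hypothetical stub: pretends to create a pathway and reports back."""
    return f"created pathway '{name}' for {organism}"


def start_stub_server():
    """Start a local XML-RPC server standing in for a remote analysis tool."""
    # Port 0 asks the operating system for any free port.
    server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
    # Register the function under a dotted, namespaced method name
    # (an assumption for illustration only).
    server.register_function(create_pathway, "Stub.createPathway")
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


server = start_stub_server()
host, port = server.server_address

# The client side is all a script needs: a proxy object whose attribute
# access is translated into remote method calls over HTTP.
client = ServerProxy(f"http://{host}:{port}")
result = client.Stub.createPathway("Example pathway", "Mus musculus")
print(result)

# Stop the stub server once the call has completed.
server.shutdown()
server.server_close()
```

Because XML-RPC clients exist for R, Perl, Python, and many other languages, the same pattern of creating a proxy and calling a named method with plain arguments carries over with little change; only the server address and the method names would need to match the real interface.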
