An Analysis of Differences in Biological Pathway Resources

Integrating content from multiple biological pathway resources is necessary to fully exploit pathway knowledge for the benefit of biology and medicine. Databases differ in content, representation, and coverage, among other respects, and these differences make merging resources difficult. We introduce a typology of representational differences between pathway resources and give examples from four databases: BioCyc, KEGG, PANTHER pathways, and Reactome. We also detect and quantify annotation mismatches between HumanCyc and Reactome. The typology of mismatches can guide entity and relationship alignment between these databases, helping to identify and understand deficiencies in our knowledge and allowing the research community to derive greater benefit from existing pathway data.

Keywords: pathway database, knowledge representation, resource comparison
