kpath: integration of metabolic pathway linked data

In the last few years, the Life Sciences domain has experienced rapid growth in the number of available biological databases. The heterogeneity of these databases makes data integration challenging: resources must be located, relationships between them identified, differing data formats reconciled, and synonyms and ambiguous terms resolved. The Linked Data approach, a set of best practices for publishing and connecting structured data on the Web, partially solves these heterogeneity problems by introducing a uniform data representation model. This article introduces kpath, a database that integrates information related to metabolic pathways. kpath also provides a navigational interface that supports both browsing the integrated data and using it in depth to build metabolic networks from existing dispersed knowledge. This user interface has been used to showcase relationships that can be inferred from the information available in several public databases.

Database URL: The public Linked Data repository can be queried at http://sparql.kpath.khaos.uma.es using the graph URI “www.khaos.uma.es/metabolic-pathways-app”. The GUI providing navigational access to the kpath database is available at http://browser.kpath.khaos.uma.es.
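As a rough illustration of how the repository above might be queried, the following sketch builds a request URL using the `query` and `default-graph-uri` parameters defined by the standard SPARQL Protocol. The endpoint and graph URI are copied from the text as given. The generic triple pattern, the function name `build_query_url`, and the assumption that the endpoint accepts GET requests with these parameters are illustrative only and are not taken from the kpath documentation. The sketch only constructs the URL and does not contact the server.

```python
from urllib.parse import urlencode

# Endpoint and graph URI as stated in the abstract.
ENDPOINT = "http://sparql.kpath.khaos.uma.es"
GRAPH_URI = "www.khaos.uma.es/metabolic-pathways-app"


def build_query_url(limit=10):
    """Build a SPARQL Protocol GET URL (hypothetical helper, not part of kpath).

    The query is a generic triple pattern; it assumes nothing about the
    vocabulary actually used in the kpath graph.
    """
    query = "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT %d" % int(limit)
    params = {
        "default-graph-uri": GRAPH_URI,  # standard SPARQL Protocol parameter
        "query": query,                  # standard SPARQL Protocol parameter
    }
    return ENDPOINT + "?" + urlencode(params)


if __name__ == "__main__":
    # Print the URL; fetching it (for example with urllib.request) is left
    # to the reader, since the availability of the endpoint is not guaranteed.
    print(build_query_url(5))
```

The resulting URL could be opened in a browser or fetched with any HTTP client. Replacing the generic pattern with pathway-specific predicates would require inspecting the vocabulary actually used in the graph.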
