Biomedical Semantic Resources for Drug Discovery Platforms

The biomedical research community provides large-scale data sources to enable knowledge discovery, either from the data alone or from novel scientific experiments combined with existing knowledge. Semantic Web technologies, including ontologies, triple stores and combinations thereof, are increasingly being developed and used. Both the amount and the complexity of the data are constantly increasing. Because the data sources are publicly available, their content can be measured, which gives an overview of the accessible content and of the state of the data representation relative to the existing content. To better understand the existing data resources, i.e. to judge the distribution of data triples across concepts, data types and primary providers, we performed a comprehensive analysis that delivers an overview of the content accessible to semantic Web solutions from publicly accessible data servers. The analysis shows that information related to genes, proteins and chemical entities forms the core, whereas content related to diseases and pathways forms a smaller portion. As a result, any approach to drug discovery would profit from the data on molecular entities, but would lack content from data resources that represent disease pathomechanisms.
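
To make the idea of measuring the distribution of triples across concepts concrete, the following is a minimal sketch and not the analysis pipeline used in this work. It assumes triples are available as plain (subject, predicate, object) tuples; the `ex:` identifiers, the toy data, and the function names `triples_per_class` and `class_shares` are all hypothetical and chosen for illustration only.

```python
from collections import Counter

RDF_TYPE = "rdf:type"

# Hypothetical toy triples; real data would come from public data servers.
TRIPLES = [
    ("ex:gene1", RDF_TYPE, "ex:Gene"),
    ("ex:gene1", "ex:encodes", "ex:protein1"),
    ("ex:gene1", "rdfs:label", "gene one"),
    ("ex:protein1", RDF_TYPE, "ex:Protein"),
    ("ex:protein1", "ex:interactsWith", "ex:drug1"),
    ("ex:drug1", RDF_TYPE, "ex:ChemicalEntity"),
    ("ex:drug1", "rdfs:label", "drug one"),
    ("ex:disease1", RDF_TYPE, "ex:Disease"),
]


def triples_per_class(triples):
    """Count, for each class, the triples whose subject is typed with that class."""
    # First pass: record the declared classes of every subject.
    classes_of = {}
    for subject, predicate, obj in triples:
        if predicate == RDF_TYPE:
            classes_of.setdefault(subject, set()).add(obj)

    # Second pass: attribute each triple to every class of its subject.
    counts = Counter()
    for subject, _predicate, _obj in triples:
        for cls in classes_of.get(subject, ()):
            counts[cls] += 1
    return counts


def class_shares(counts):
    """Turn absolute counts into fractions of the total attributed triples."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {cls: n / total for cls, n in counts.items()}


if __name__ == "__main__":
    counts = triples_per_class(TRIPLES)
    for cls, share in sorted(class_shares(counts).items(), key=lambda kv: -kv[1]):
        print(f"{cls}: {counts[cls]} triples ({share:.1%})")
```

Attributing a triple to the class of its subject is only one possible convention; counting by predicate, by object datatype, or by providing server would follow the same two-pass pattern with a different grouping key.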
