Integration of Scholarly Communication Metadata Using Knowledge Graphs

Important questions about the scientific community, e.g., which authors are experts in a certain field or are actively engaged in international collaborations, can be answered using publicly available datasets. However, the data required to answer such questions is often scattered across multiple isolated datasets. Recently, the Knowledge Graph (KG) concept has been identified as a means of interweaving heterogeneous datasets and enhancing answer completeness and soundness. We present a pipeline for creating high-quality knowledge graphs that comprise data collected from multiple isolated structured datasets. As a proof of concept, we illustrate the steps in the construction of a knowledge graph in the domain of scholarly communication metadata (SCM-KG). In particular, we demonstrate the benefits of exploiting Semantic Web technology to reconcile data about authors, papers, and conferences. We conducted an experimental study on an SCM-KG that merges scientific research metadata from the DBLP bibliographic source and the Microsoft Academic Graph. The observed results provide evidence that queries are processed more effectively on top of the SCM-KG than over the isolated datasets, while execution time is not negatively affected.
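The reconciliation step mentioned above can be illustrated with a minimal sketch. It is not the pipeline used in this work: it assumes author records are available as simple identifier-to-name mappings, uses a generic string-similarity ratio from the Python standard library as a stand-in for whatever matching method the pipeline applies, and emits `owl:sameAs`-style triples. The function names, the threshold of 0.9, the identifiers, and the toy records are all hypothetical.

```python
from difflib import SequenceMatcher
import unicodedata


def normalize(name: str) -> str:
    """Lowercase, strip accents, and collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", name)
    no_accents = "".join(c for c in decomposed if not unicodedata.combining(c))
    return " ".join(no_accents.lower().split())


def similarity(a: str, b: str) -> float:
    """Similarity in [0, 1] between two normalized names."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def link_authors(source_a: dict, source_b: dict, threshold: float = 0.9) -> list:
    """Link each record in source_a to its best match in source_b.

    Returns (id_a, "owl:sameAs", id_b) triples for matches whose
    similarity reaches the (assumed) threshold.
    """
    links = []
    for id_a, name_a in source_a.items():
        best_id, best_score = None, 0.0
        for id_b, name_b in source_b.items():
            score = similarity(name_a, name_b)
            if score > best_score:
                best_id, best_score = id_b, score
        if best_id is not None and best_score >= threshold:
            links.append((id_a, "owl:sameAs", best_id))
    return links


# Toy records standing in for two isolated datasets (hypothetical IDs and names).
dblp_authors = {"dblp:a1": "Ana María López", "dblp:a2": "John Smith"}
mag_authors = {"mag:1": "Ana Maria Lopez", "mag:2": "Jane Doe"}

links = link_authors(dblp_authors, mag_authors)
for triple in links:
    print(triple)
```

Name matching alone cannot tell apart distinct authors who share a name, so a realistic reconciliation step would presumably also compare co-authors, venues, or affiliations before asserting that two records denote the same person.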
