Link Prediction of Weighted Triples for Knowledge Graph Completion Within the Scholarly Domain

Knowledge graphs (KGs) are widely used for modeling scholarly communication, performing scientometric analyses, and supporting a variety of intelligent services to explore the literature and predict research dynamics. However, they often suffer from incompleteness (e.g., missing affiliations, references, research topics), which reduces the scope and quality of the resulting analyses. This issue is usually tackled by computing knowledge graph embeddings (KGEs) and applying link prediction techniques. However, only a few KGE models can take the weights of facts in the knowledge graph into account. Such weights can have different meanings, e.g., they may describe the degree of association or the degree of truth of a certain triple. In this paper, we propose the Weighted Triple Loss, a new loss function for KGE models that takes full advantage of the additional numerical weights on facts and is even tolerant to incorrect weights. We also extend the Rule Loss, a loss function that is able to exploit a set of logical rules, so that it works with weighted triples. The evaluation of our solutions on several knowledge graphs indicates significant performance improvements with respect to the state of the art. Our main use case is the large-scale AIDA knowledge graph, which describes 21 million research articles. Our approach makes it possible to complete information about affiliation types, countries, and research topics, greatly improving the scope of the resulting scientometric analyses and providing better support to systems for monitoring and predicting research dynamics.
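The abstract does not state the formulation of the Weighted Triple Loss, so the following is only an illustrative sketch of the general idea of letting triple weights influence a KGE training objective. It assumes a TransE-style distance score and a margin ranking loss in which each positive triple's hinge term is scaled by its weight. The names `transe_distance` and `weighted_margin_loss`, the `margin` parameter, and the scaling scheme are all assumptions made for illustration, not the loss proposed in the paper.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def transe_distance(head: Vector, relation: Vector, tail: Vector) -> float:
    """L1 distance ||h + r - t||; a lower value means a more plausible triple."""
    return sum(abs(h + r - t) for h, r, t in zip(head, relation, tail))


def weighted_margin_loss(
    positives: List[Tuple[Vector, Vector, Vector, float]],
    negatives: List[Tuple[Vector, Vector, Vector]],
    margin: float = 1.0,
) -> float:
    """Mean hinge loss with each term scaled by the positive triple's weight.

    Hypothetical weighting scheme (an assumption, not the paper's loss):
    triples with a higher weight contribute more to the objective.
    """
    if len(positives) != len(negatives):
        raise ValueError("expected one negative triple per positive triple")
    if not positives:
        raise ValueError("expected at least one triple pair")
    total = 0.0
    for (h, r, t, weight), (nh, nr, nt) in zip(positives, negatives):
        pos = transe_distance(h, r, t)
        neg = transe_distance(nh, nr, nt)
        # Standard margin ranking term, multiplied by the triple weight.
        total += weight * max(0.0, margin + pos - neg)
    return total / len(positives)


# Toy 2-dimensional embeddings: the positive triple fits h + r = t exactly,
# while the corrupted tail is off by 0.5 in one coordinate.
h, r, t = [0.0, 0.0], [1.0, 0.0], [1.0, 0.0]
corrupted_tail = [1.0, 0.5]
loss = weighted_margin_loss(
    positives=[(h, r, t, 0.8), (h, r, t, 0.2)],
    negatives=[(h, r, corrupted_tail), (h, r, corrupted_tail)],
)
print(loss)
```

In this toy setting both pairs violate the margin by the same amount, so the only difference in their contribution comes from the weights 0.8 and 0.2. A triple with weight 0 would contribute nothing, which is one simple way a loss could discount facts with low confidence.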
