Can We Predict New Facts with Open Knowledge Graph Embeddings? A Benchmark for Open Link Prediction

Open Information Extraction systems extract (“subject text”, “relation text”, “object text”) triples from raw text. Some triples are textual versions of facts, i.e., non-canonicalized mentions of entities and relations. In this paper, we investigate whether it is possible to infer new facts directly from the open knowledge graph without any canonicalization or any supervision from curated knowledge. For this purpose, we propose the open link prediction task, i.e., predicting test facts by completing (“subject text”, “relation text”, ?) questions. An evaluation in such a setup raises the question of whether a correct prediction is actually a new fact induced by reasoning over the open knowledge graph or whether it can be trivially explained. For example, facts can appear in different paraphrased textual variants, which can lead to test leakage. To this end, we propose an evaluation protocol and a methodology for creating the open link prediction benchmark OLPBENCH. We performed experiments with a prototypical knowledge graph embedding model for open link prediction. While the task is very challenging, our results suggest that it is possible to predict genuinely new facts that cannot be trivially explained.
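To make the setup concrete, below is a minimal sketch of open link prediction over non-canonicalized triples. The data (example mentions), the bag-of-tokens mention encoder, the random token embeddings, and the DistMult-style scoring function are illustrative assumptions, not the benchmark model described in the paper; the actual model may compose mention embeddings differently (e.g., with a recurrent encoder) and be trained rather than randomly initialized.

```python
# Sketch of open link prediction: answer ("subject text", "relation text", ?)
# by ranking candidate object mentions with an embedding-based scorer.
# All modeling choices here are illustrative assumptions.
import numpy as np

DIM = 8
rng = np.random.default_rng(0)

# Open KG: non-canonicalized ("subject text", "relation text", "object text") triples.
open_triples = [
    ("NBC-TV", "has office in", "New York"),
    ("NBC", "has headquarters in", "New York City"),
    ("Tim Cook", "is chief executive of", "Apple"),
]

def tokens(mention):
    return mention.lower().split()

# Token vocabulary with random embeddings (these would be learned in practice).
vocab = sorted({t for triple in open_triples for mention in triple for t in tokens(mention)})
token_emb = {t: rng.normal(size=DIM) for t in vocab}

def encode(mention):
    """Compose a mention embedding from its tokens (here: mean pooling)."""
    vecs = [token_emb[t] for t in tokens(mention) if t in token_emb]
    return np.mean(vecs, axis=0) if vecs else np.zeros(DIM)

def score(subj, rel, obj):
    """DistMult-style score: sum of the elementwise product of the three embeddings."""
    return float(np.sum(encode(subj) * encode(rel) * encode(obj)))

def answer(subj, rel, candidates):
    """Rank candidate object mentions for an open query (subj, rel, ?)."""
    return sorted(candidates, key=lambda o: score(subj, rel, o), reverse=True)

candidates = [o for _, _, o in open_triples]
print(answer("NBC-TV", "has office in", candidates))
```

Note that the query subject “NBC-TV” and the stored subject “NBC” are different surface forms of the same entity; evaluating whether a correct answer reflects genuine inference or merely such paraphrase overlap (test leakage) is exactly what the proposed evaluation protocol is designed to control for.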
