Automating the expansion of a knowledge graph

Abstract For computers to understand human language and to reason, human knowledge must be represented and stored in a form they can process. Knowledge graphs have been developed as a kind of knowledge base for words and the general relationships among them. However, they have two limitations. First, for most human languages, existing knowledge graphs are limited in size and scope. Second, they cannot handle neologisms, which form part of human common sense. To address these problems, we have developed and validated PolarisX, which automatically expands a knowledge graph by crawling and analyzing news sites and social media in real time. We fine-tune the pre-trained multilingual BERT model so that knowledge graphs can be constructed without language dependencies. We extract new relationships using the BERT-based relation extraction model and integrate them into the knowledge graph. We verify the novelty and accuracy of PolarisX and show that it handles neologisms and has no language dependencies.
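The integration step described above can be illustrated with a minimal sketch. It is not the PolarisX implementation: the `Triple` and `KnowledgeGraph` types, the `expand` function, the `min_confidence` threshold, and the example triples are all hypothetical, and the relation extraction model is assumed to have already produced candidate triples with confidence scores. The sketch only shows how already-known relationships could be separated from novel ones when merging into a graph.

```python
from collections import defaultdict
from typing import Iterable, List, NamedTuple, Tuple


class Triple(NamedTuple):
    """A (head, relation, tail) statement; a hypothetical representation."""
    head: str
    relation: str
    tail: str


class KnowledgeGraph:
    """Toy in-memory graph mapping each head to a set of (relation, tail) edges."""

    def __init__(self) -> None:
        self._edges = defaultdict(set)

    def __contains__(self, triple: Triple) -> bool:
        # Use .get so that a lookup never creates an empty entry.
        return (triple.relation, triple.tail) in self._edges.get(triple.head, set())

    def __len__(self) -> int:
        return sum(len(edges) for edges in self._edges.values())

    def add(self, triple: Triple) -> bool:
        """Insert a triple; return True only if it was not already present."""
        if triple in self:
            return False
        self._edges[triple.head].add((triple.relation, triple.tail))
        return True


def expand(
    graph: KnowledgeGraph,
    candidates: Iterable[Tuple[Triple, float]],
    min_confidence: float = 0.5,  # assumed threshold, not taken from the paper
) -> List[Triple]:
    """Merge scored candidate triples into the graph and return the novel ones."""
    added = []
    for triple, confidence in candidates:
        if confidence >= min_confidence and graph.add(triple):
            added.append(triple)
    return added


# Usage: one candidate is already known, one is novel, one is too uncertain.
graph = KnowledgeGraph()
graph.add(Triple("BERT", "IsA", "language model"))
candidates = [
    (Triple("BERT", "IsA", "language model"), 0.97),
    (Triple("doomscrolling", "RelatedTo", "social media"), 0.81),
    (Triple("doomscrolling", "IsA", "sport"), 0.12),
]
novel = expand(graph, candidates)
```

In this sketch a relationship counts as novel only if the exact triple is absent from the graph; a real system would likely also need entity normalization and conflict handling, which are left out here.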
