Frame Instance Extraction and Clustering for Default Knowledge Building

Obtaining and representing common-sense knowledge, which robots need for planning and for making inferences about their surroundings, is a challenging problem because such knowledge is typically found only in unstructured repositories such as text corpora or in small handmade resources. This paper presents a methodology for automatically creating a default knowledge base about real-world objects for the robotics domain. The proposed method distills default knowledge by clustering frame instances extracted from natural language text. We collect and parse a natural language corpus using the Web as a source, perform agglomerative clustering of the frame instances according to an appropriately defined similarity measure, and finally extract prototypical frame instances from each cluster and publish them in an LOD-compliant format to promote reuse and interoperability.
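The clustering and prototype-extraction steps can be illustrated with a minimal sketch. It is not the similarity measure or implementation used in this work: the frame-instance representation, the Jaccard-based `similarity` function, the average-link criterion, the stopping `threshold`, and the medoid-style `prototype` selection are all assumptions chosen for illustration, and the example frame and role names are hypothetical.

```python
from itertools import combinations

# Assumed representation: a frame instance is a pair
# (frame_name, frozenset of (role, filler) pairs).
def make_instance(frame, **roles):
    return (frame, frozenset(roles.items()))

def similarity(a, b):
    """Illustrative measure (an assumption): 0 for different frames,
    otherwise the Jaccard overlap of the (role, filler) pairs."""
    if a[0] != b[0]:
        return 0.0
    union = a[1] | b[1]
    if not union:
        return 1.0
    return len(a[1] & b[1]) / len(union)

def average_link(c1, c2):
    """Mean pairwise similarity between two clusters."""
    return sum(similarity(a, b) for a in c1 for b in c2) / (len(c1) * len(c2))

def agglomerate(instances, threshold=0.25):
    """Repeatedly merge the two most similar clusters until the best
    available merge falls below the threshold."""
    clusters = [[inst] for inst in instances]
    while len(clusters) > 1:
        (i, j), best = max(
            (((i, j), average_link(clusters[i], clusters[j]))
             for i, j in combinations(range(len(clusters)), 2)),
            key=lambda pair: pair[1],
        )
        if best < threshold:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i stays valid
    return clusters

def prototype(cluster):
    """Pick the member with the highest total similarity to its cluster."""
    return max(cluster, key=lambda x: sum(similarity(x, y) for y in cluster))

# Hypothetical frame instances, for illustration only.
instances = [
    make_instance("Placing", theme="cup", goal="table"),
    make_instance("Placing", theme="cup", goal="shelf"),
    make_instance("Placing", theme="cup", goal="table", agent="person"),
    make_instance("Ingestion", ingestibles="coffee", source="cup"),
]

clusters = agglomerate(instances, threshold=0.25)
for cluster in clusters:
    print(len(cluster), prototype(cluster)[0], sorted(prototype(cluster)[1]))
```

In this toy run the three `Placing` instances end up in one cluster and the `Ingestion` instance stays on its own; the prototype of the larger cluster is the instance whose role fillers overlap most with the others. A real pipeline would replace `similarity` with a measure that accounts for frame relatedness and lexical similarity of fillers.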
