The KnowledgeStore: an Entity-Based Storage System

This paper describes the KnowledgeStore, a large-scale infrastructure for the combined storage and interlinking of multimedia resources and ontological knowledge. Information in the KnowledgeStore is organized around entities, such as persons, organizations and locations. The system supports (i) importing background knowledge about entities in the form of annotated RDF triples; (ii) associating resources with entities by automatically recognizing, coreferring and linking mentions of named entities; and (iii) deriving new entities from knowledge extracted from mentions. The KnowledgeStore builds on state-of-the-art technologies for language processing, including document tagging, named entity extraction and cross-document coreference. Its design tightly integrates linguistic and semantic features, and eases further processing of information by explicitly representing the contexts in which knowledge and mentions are valid or relevant. We describe the system and report on the creation of a large-scale KnowledgeStore instance for storing and integrating multimedia content and background knowledge relevant to the Italian Trentino region.
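
To make the entity-centric organization concrete, the following is a minimal in-memory sketch of the three operations listed above. It is an illustration under stated assumptions, not the KnowledgeStore implementation or its API: the class `ToyStore`, its methods, the `ex:` and `ctx:` identifiers, and the resource ids are all hypothetical, and mention recognition and coreference are assumed to have been performed upstream by the language processing tools.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Triple:
    """An RDF-style statement annotated with a context label (hypothetical)."""
    subject: str
    predicate: str
    obj: str
    context: str  # where the statement is assumed to be valid or relevant


@dataclass
class Entity:
    """An entity with its background knowledge and its linked mentions."""
    uri: str
    triples: list = field(default_factory=list)
    mentions: list = field(default_factory=list)  # (resource_id, surface_form)


class ToyStore:
    """Hypothetical toy store; not the actual KnowledgeStore interface."""

    def __init__(self):
        self.entities = {}

    def _entity(self, uri):
        # Return the entity for this URI, creating it on first use.
        return self.entities.setdefault(uri, Entity(uri))

    def import_triple(self, subject, predicate, obj, context):
        # (i) import background knowledge as a context-annotated triple
        triple = Triple(subject, predicate, obj, context)
        self._entity(subject).triples.append(triple)

    def link_mention(self, resource_id, surface_form, uri):
        # (ii) associate a resource with an entity through a mention;
        # (iii) if the entity is unknown, it is derived from the mention.
        # Returns True when a new entity had to be created.
        is_new = uri not in self.entities
        self._entity(uri).mentions.append((resource_id, surface_form))
        return is_new

    def resources_for(self, uri):
        # All resources that mention the given entity, without duplicates.
        return sorted({rid for rid, _ in self.entities[uri].mentions})


store = ToyStore()
store.import_triple("ex:Trento", "rdf:type", "ex:Location", "ctx:geography")
known = store.link_mention("news-001", "Trento", "ex:Trento")
derived = store.link_mention("news-002", "Mario Rossi", "ex:Mario_Rossi")
```

In this sketch, `known` is `False` because `ex:Trento` already existed from the imported background knowledge, while `derived` is `True` because the second mention introduces an entity not yet in the store.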
