KIM – a semantic platform for information extraction and retrieval

The KIM platform provides a novel Knowledge and Information Management framework and services for automatic semantic annotation, indexing, and retrieval of documents. It provides a mature and semantically enabled infrastructure for scalable and customizable information extraction (IE). Our understanding is that a system for semantic annotation should be based upon a simple model of real-world entity concepts, complemented with quasi-exhaustive instance knowledge. To ensure efficiency, easy sharing, and reusability of the metadata, we introduce an upper-level ontology. Based on the ontology, a large-scale instance base of entity descriptions is maintained. The knowledge resources involved are handled using state-of-the-art Semantic Web technology and standards, including RDF(S) repositories, ontology middleware, and reasoning. From a technical point of view, the platform allows KIM-based applications to use it for automatic semantic annotation, for content retrieval based on semantic queries, and for semantic repository access. As a framework, KIM also allows various IE modules, semantic repositories, and information retrieval engines to be plugged into it. This paper presents the KIM platform, with an emphasis on its architecture, interfaces, front-ends, and other technical issues.
