Towards Knowledge Acquisition from Information Extraction

In our research on using information extraction (IE) to help populate the semantic web, we have encountered significant obstacles to interoperability between the two technologies. We believe these obstacles are endemic to the basic paradigms rather than quirks of the specific implementations we have worked with. In particular, we identify five dimensions of interoperability that must be addressed in order to populate semantic web knowledge bases that are suitable for reasoning from the output of information extraction systems. We call the task of transforming IE data into knowledge bases "knowledge integration", and briefly present a framework called KITE in which we are exploring these dimensions. Finally, we report initial results of an experiment in which the knowledge integration process uses the deeper semantics of OWL ontologies to improve the precision of relation extraction from text.
