Automatic Building of Semantically Rich Domain Models from Unstructured Data

The availability of massive amounts of raw domain data has created an urgent need for sophisticated AI systems that can find complex and useful information in big-data repositories in real time. Such systems should be able to process natural language documents and extract significant information from them, support search and complex question answering, make sophisticated predictions about future events, and generally interact with users in much more powerful and intuitive ways. To be effective, these systems need a significant amount of domain-specific knowledge in addition to general-domain knowledge. Ontologies and knowledge bases represent knowledge about domains of interest and serve as the backbone for semantic technologies and applications. However, creating such domain models is time-consuming and error-prone, and the end product is difficult to maintain. In this paper, we present a novel methodology for automatically building semantically rich knowledge models for specific domains from domain-relevant unstructured data drawn from resources such as web articles, manuals, e-books, and blogs. We also present evaluation results for our automatic ontology/knowledge-base generation methodology using freely available textual resources from the World Wide Web.
