Learning ontologies from natural language texts

Research on ontologies is becoming increasingly widespread in the computer science community. The major problems in building ontologies are the knowledge acquisition bottleneck and the time-consuming construction of separate ontologies for different domains and applications. Automating ontology construction is one solution.

We propose an automatic ontology building approach. In this approach, the system starts from a small ontology kernel and constructs the ontology automatically through text understanding. The kernel contains the primitive concepts, relations and operators needed to build an ontology. The features of the proposed model are that it is domain and application independent; builds ontologies upon a small primary kernel; learns words, concepts, taxonomic and non-taxonomic relations, and axioms; and applies a symbolic, hybrid ontology learning approach consisting of logical, linguistic-based, template-driven and semantic analysis methods.

Hasti is an ongoing project to implement and test the automatic ontology building approach. It extracts lexical and ontological knowledge from Persian (Farsi) texts.

In this paper, we first describe some ontology engineering problems that motivated our approach. In the following sections, after a brief description of Hasti, its features and its architecture, we discuss its components in detail, describing the learning algorithms for each part. We then discuss some experimental results. Finally, we give an overview of related work, introduce a general framework for comparing ontology learning systems, and compare Hasti with related work according to that framework.
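To make the idea of growing an ontology from a small kernel more concrete, the following is a minimal sketch and not Hasti's actual algorithm. It assumes a kernel containing a single root concept, a single taxonomic relation, and one hypothetical English "X is a Y" template (Hasti itself processes Persian text and combines several analysis methods). The names `Ontology`, `make_kernel`, `IS_A` and `learn_from_text` are illustrative assumptions.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Ontology:
    # Maps each concept to the set of its direct parents (taxonomic links).
    parents: dict = field(default_factory=dict)

    def add_concept(self, name, parent="Thing"):
        """Register a concept and link it under the given parent."""
        self.parents.setdefault(name, set())
        if name != parent:
            self.parents.setdefault(parent, set())
            self.parents[name].add(parent)

    def ancestors(self, name):
        """Return every concept reachable by following parent links."""
        seen = set()
        stack = list(self.parents.get(name, ()))
        while stack:
            concept = stack.pop()
            if concept not in seen:
                seen.add(concept)
                stack.extend(self.parents.get(concept, ()))
        return seen


def make_kernel():
    """Assumed kernel: only the root concept 'Thing'."""
    onto = Ontology()
    onto.parents["Thing"] = set()
    return onto


# Hypothetical template: "<article> X is <article> Y" suggests X is-a Y.
IS_A = re.compile(
    r"\b(?:an?|the)\s+(\w+)\s+is\s+(?:an?|the)\s+(\w+)", re.IGNORECASE
)


def learn_from_text(onto, text):
    """Extend the ontology with taxonomic links found by the template."""
    for child, parent in IS_A.findall(text):
        child, parent = child.lower(), parent.lower()
        if parent not in onto.parents:
            # An unknown parent is first placed directly under the root.
            onto.add_concept(parent)
        onto.add_concept(child, parent)
    return onto
```

For example, calling `learn_from_text(make_kernel(), "A dog is an animal. A poodle is a dog.")` places `animal` under the root, `dog` under `animal`, and `poodle` under `dog`. A real system would also need to handle non-taxonomic relations, axioms, and ambiguity, none of which this sketch attempts.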
