Nature inspired Reasoning for the Semantic Web (NatuReS)

The success of Semantic Web research depends on the construction of complete and reliable domain ontologies. In this paper we describe an unsupervised framework for domain ontology enrichment based on mining domain text corpora. Specifically, we enrich the hierarchical backbone of an existing ontology, i.e. its taxonomy, with new domain-specific concepts. The framework is based on an extended model of hierarchical self-organizing maps. Because it is founded on an unsupervised neural network architecture, the framework can be applied to different languages and domains. Terms extracted from the text corpus are represented in a distributional vector space that encodes their contextual content. The enrichment amounts to classifying the extracted terms into the existing taxonomy, attaching each one as a hyponym of a taxonomy node. The reported experiments are in the “Lonely Planet” tourism domain. The taxonomy and the corpus are the ones proposed in the PASCAL ontology learning and population challenge. The experimental results show that the quality of the enrichment is considerably improved by using semantics-based vector representations for the classified (newly added) terms, such as the document category histograms (DCH) and the document frequency times inverse term frequency (DF-ITF) weighting scheme.
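To make the idea of classifying extracted terms into an existing taxonomy more concrete, the following is a minimal sketch and not the framework described above. It assumes that each document already carries a category label, builds a simple category histogram for every term (one plausible reading of a DCH-style representation), and attaches each new term to the most similar taxonomy concept by cosine similarity. This flat nearest-neighbour step stands in for the hierarchical self-organizing map; the function names and the toy corpus are hypothetical.

```python
import math
from collections import Counter


def category_histogram(term, docs):
    """Relative frequency of `term` across document categories.

    `docs` is a list of (category, tokens) pairs; the category labels
    are an assumption of this sketch.
    """
    counts = Counter()
    for category, tokens in docs:
        counts[category] += tokens.count(term)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {c: n / total for c, n in counts.items() if n}


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(key, 0.0) for key, weight in u.items())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def attach_terms(new_terms, concepts, docs):
    """Map each new term to the taxonomy concept with the closest histogram."""
    concept_vecs = {c: category_histogram(c, docs) for c in concepts}
    attachments = {}
    for term in new_terms:
        vec = category_histogram(term, docs)
        attachments[term] = max(concepts, key=lambda c: cosine(vec, concept_vecs[c]))
    return attachments


# Hypothetical toy corpus in a tourism-like domain.
DOCS = [
    ("lodging", ["hotel", "hostel", "hotel", "guesthouse"]),
    ("lodging", ["hostel", "hotel"]),
    ("nature", ["beach", "lagoon", "beach"]),
    ("nature", ["lagoon", "beach", "hotel"]),
]

if __name__ == "__main__":
    print(attach_terms(["hostel", "lagoon"], ["hotel", "beach"], DOCS))
```

In this toy corpus, "hostel" shares the lodging-heavy profile of "hotel" and "lagoon" shares the nature profile of "beach", so each new term is attached under the concept whose documents it co-occurs with.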
