Ontology-Based Semantic Online Classification of Documents: Supporting Users in Searching the Web

In this paper we describe first results of our research on the categorization of texts in order to disambiguate user queries. The discussed methods combine indexing and ontology-based information retrieval techniques in an interactive retrieval system. We present an approach that classifies search results by mapping them to semantic classes defined by the senses of a query term. The criteria defining each class, or ‘sense folder’, are derived from the concepts of an assigned ontology, here MultiWordNet. Each element of a result set is annotated with the ‘sense folder’ it is classified in, so the user gets additional information about each item: the search term is disambiguated with respect to the underlying document. The user can thus decide more easily whether the document is relevant to the query.
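The idea of sorting results into sense folders can be illustrated with a minimal sketch. It is not the authors' implementation: the sense inventory below is a hand-written toy standing in for concepts that the paper derives from MultiWordNet, and the names `SENSE_FOLDERS`, `classify` and `annotate` are hypothetical. The scoring rule (counting overlapping words) is likewise an assumption chosen for brevity.

```python
import re
from collections import Counter

# Hypothetical toy sense inventory for the query term "bank".
# The paper derives such criteria from MultiWordNet concepts; these
# word lists are invented purely for illustration.
SENSE_FOLDERS = {
    "bank (financial institution)": {"money", "account", "loan", "deposit", "credit"},
    "bank (river side)": {"river", "water", "shore", "slope", "stream"},
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def classify(document, sense_folders=SENSE_FOLDERS):
    """Return the sense folder whose related words occur most often in the document."""
    counts = Counter(tokenize(document))
    scores = {
        sense: sum(counts[word] for word in words)
        for sense, words in sense_folders.items()
    }
    best = max(scores, key=scores.get)
    # A document with no matching evidence is left unclassified.
    return best if scores[best] > 0 else "unclassified"


def annotate(results):
    """Pair every search result with the sense folder it was classified in."""
    return [(classify(doc), doc) for doc in results]


results = [
    "The bank raised the interest on every loan and deposit account.",
    "We walked along the bank of the river until the water rose.",
    "Bank holiday opening hours.",
]

for folder, doc in annotate(results):
    print(f"[{folder}] {doc}")
```

In this sketch the third result matches neither sense and is labeled as unclassified, which is one possible way to handle documents that give no evidence for any sense of the query term.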
