A fuzzy logic approach to information retrieval using an ontology-based representation of documents

Abstract: This paper proposes an approach to information retrieval based on a fuzzy conceptual structure (an ontology) that is used both for indexing documents and for expressing user queries. The conceptual structure is hierarchical and encodes knowledge of the topical domain of the documents under consideration. It is formally represented as a weighted tree. In this approach, conjunctive queries are evaluated by comparing the minimal sub-trees that contain the two sets of nodes corresponding to the concepts expressed in the document and in the query, respectively. The comparison relies on computing a multiple-valued degree of inclusion. Several candidate implications are discussed on the basis of their respective semantics. The proposed approach generalizes standard fuzzy information retrieval, and its evaluation on a benchmark example is also presented.
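
As an illustration of the kind of computation the abstract describes, the following minimal Python sketch scores a conjunctive query against a document by a multiple-valued degree of inclusion. It rests on several assumptions that are not taken from the paper: the toy ontology `PARENT`, the approximation of each minimal sub-tree by the weighted concepts plus all their ancestors up to the root (`ancestor_closure`), the rule that an ancestor receives the maximum weight found beneath it, and the choice of four textbook fuzzy implications (Gödel, Goguen, Łukasiewicz, Kleene-Dienes) as the candidates. The paper's actual weighting scheme and implications may differ.

```python
from typing import Callable, Dict, Optional

# Hypothetical toy ontology (not from the paper): child -> parent, root -> None.
PARENT: Dict[str, Optional[str]] = {
    "animal": None,
    "mammal": "animal",
    "bird": "animal",
    "dog": "mammal",
    "cat": "mammal",
}

Implication = Callable[[float, float], float]


# Four standard fuzzy implications; which ones the paper retains is an assumption.
def goedel(a: float, b: float) -> float:
    return 1.0 if a <= b else b


def goguen(a: float, b: float) -> float:
    return 1.0 if a <= b else b / a


def lukasiewicz(a: float, b: float) -> float:
    return min(1.0, 1.0 - a + b)


def kleene_dienes(a: float, b: float) -> float:
    return max(1.0 - a, b)


def ancestor_closure(weights: Dict[str, float]) -> Dict[str, float]:
    """Add every ancestor of each weighted concept.

    Assumption: an ancestor takes the maximum weight found beneath it.
    This stands in for the minimal sub-tree mentioned in the abstract.
    """
    closed: Dict[str, float] = {}
    for concept, weight in weights.items():
        node: Optional[str] = concept
        while node is not None:
            closed[node] = max(closed.get(node, 0.0), weight)
            node = PARENT[node]
    return closed


def inclusion_degree(
    query: Dict[str, float],
    document: Dict[str, float],
    implication: Implication,
) -> float:
    """Degree to which the query sub-tree is included in the document sub-tree.

    Conjunction over the query nodes is modelled by min.
    """
    q = ancestor_closure(query)
    d = ancestor_closure(document)
    return min(implication(w, d.get(node, 0.0)) for node, w in q.items())
```

A call such as `inclusion_degree({"dog": 0.8}, {"dog": 0.6, "cat": 0.9}, goguen)` compares the two closures node by node and keeps the weakest implication value. Swapping the `implication` argument changes how a document weight that falls below the query weight is penalized, which is the semantic difference between the candidate implications.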
