A fuzzy set approach to concept-based information retrieval

This paper proposes an information retrieval approach based on a fuzzy conceptual structure that is used both to index documents and to express user queries. The conceptual structure is hierarchical and encodes knowledge of the topical domain of the documents under consideration. It is formally represented as a weighted tree. Conjunctive queries are evaluated by comparing two minimal subtrees: one containing the nodes that correspond to the concepts expressed in the document, and one containing the nodes that correspond to the concepts expressed in the query. The comparison relies on different multiple-valued degrees of inclusion, which are discussed. The proposed approach generalizes standard fuzzy information retrieval. An evaluation of the approach is also presented.
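The comparison of weighted subtrees described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the toy hierarchy `PARENT`, the rule that a concept's weight is propagated to its ancestors by taking the maximum (one simple way to obtain a subtree containing the selected nodes), and the choice of the Gödel and Łukasiewicz implications as two examples of multiple-valued inclusion degrees. The degree to which the query subtree is included in the document subtree is taken here as the minimum, over query nodes, of the implication from the query weight to the document weight.

```python
from typing import Callable, Dict, Optional

# Hypothetical toy hierarchy (child -> parent); None marks the root.
PARENT: Dict[str, Optional[str]] = {
    "science": None,
    "physics": "science",
    "biology": "science",
    "optics": "physics",
    "genetics": "biology",
}


def upward_closure(weights: Dict[str, float]) -> Dict[str, float]:
    """Build a weighted subtree from weighted concepts.

    Assumption: each concept passes its weight to all of its ancestors,
    and a node keeps the largest weight it receives.
    """
    closed: Dict[str, float] = {}
    for node, weight in weights.items():
        current: Optional[str] = node
        while current is not None:
            closed[current] = max(closed.get(current, 0.0), weight)
            current = PARENT[current]
    return closed


def goedel(a: float, b: float) -> float:
    """Goedel implication: 1 if a <= b, otherwise b."""
    return 1.0 if a <= b else b


def lukasiewicz(a: float, b: float) -> float:
    """Lukasiewicz implication: min(1, 1 - a + b)."""
    return min(1.0, 1.0 - a + b)


def inclusion_degree(
    query: Dict[str, float],
    doc: Dict[str, float],
    implication: Callable[[float, float], float],
) -> float:
    """Degree to which the query subtree is included in the document subtree."""
    q = upward_closure(query)
    d = upward_closure(doc)
    # A node missing from the document subtree has weight 0.
    # An empty query is trivially included (degree 1).
    return min(
        (implication(w, d.get(node, 0.0)) for node, w in q.items()),
        default=1.0,
    )


query = {"optics": 0.8}
document = {"optics": 0.5, "genetics": 1.0}
print(inclusion_degree(query, document, goedel))
print(inclusion_degree(query, document, lukasiewicz))
```

With crisp weights (0 or 1) on a flat structure, both implications reduce to ordinary set inclusion, which is one way to read the claim that the tree-based approach generalizes standard fuzzy retrieval. The two implications differ in how strongly they penalize a document weight that falls just short of the query weight.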
