Paper information - Fuzzy document retrieval for Portuguese language

This paper reports a document-retrieval model for the Portuguese language, developed from Miyamoto's document-retrieval model. The Miyamoto model detects semantic similarity between descriptors from their co-occurrences. The proposed model can be considered an extension of the Miyamoto model because it also takes lexical similarities and expression similarities into account. Descriptors and queries are therefore treated as expressions, i.e. series of words and connectors (prepositions, etc.). Word similarity is determined by comparing the possible radicals (stems) of each word, in order to detect words with identical or similar meanings. Expression similarity is determined by comparing words and connectors using an adaptation of the Bruza and van der Weide (1991) model. The proposed extension provides two things: a fuzzy thesaurus built from a fuzzy index, achieved through lexical similarities between descriptors; and support for a non-controlled vocabulary, achieved by determining similarities between document descriptors and query expressions. A sample document base was created to compare the two models. The results show the usefulness of the proposed model for document retrieval in the Portuguese language.
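The abstract does not give the formulas, so the following is only a minimal sketch of the two ideas it names: descriptor similarity from co-occurrence, and word similarity from shared candidate radicals. Every function name here is hypothetical. The Jaccard-style ratio is one common co-occurrence measure and is assumed for illustration, not taken from the paper. Treating every prefix of at least three letters as a candidate radical is likewise a simplifying assumption, not the authors' procedure.

```python
from typing import Dict, Set


def cooccurrence_similarity(a: str, b: str, index: Dict[str, Set[str]]) -> float:
    """Assumed measure: documents containing both descriptors divided by
    documents containing either one (a Jaccard-style ratio)."""
    docs_a = index.get(a, set())
    docs_b = index.get(b, set())
    union = docs_a | docs_b
    if not union:
        return 0.0
    return len(docs_a & docs_b) / len(union)


def candidate_radicals(word: str, min_len: int = 3) -> Set[str]:
    """Hypothetical stand-in for radical detection: every prefix of the
    lowercased word that has at least min_len characters."""
    w = word.lower()
    return {w[:i] for i in range(min_len, len(w) + 1)}


def lexical_similarity(w1: str, w2: str) -> float:
    """Overlap between the candidate-radical sets of two words, in [0, 1]."""
    if w1.lower() == w2.lower():
        return 1.0  # identical words, including ones shorter than min_len
    r1, r2 = candidate_radicals(w1), candidate_radicals(w2)
    union = r1 | r2
    if not union:
        return 0.0
    return len(r1 & r2) / len(union)


if __name__ == "__main__":
    # Toy index: descriptor -> set of document ids (invented for illustration).
    index = {
        "recuperação": {"d1", "d2"},
        "busca": {"d2", "d3"},
        "documento": {"d1", "d2", "d3"},
    }
    print(cooccurrence_similarity("recuperação", "busca", index))
    print(lexical_similarity("documento", "documentos"))
```

In this sketch, "documento" and "documentos" score high because they share almost all candidate radicals, while unrelated words score zero. A real system for Portuguese would need proper suffix handling and a separate treatment of connectors, which this sketch omits.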

Raul Sidnei Wazlawick | Bernd Heinrich Storb

[1] Peter Bruza, et al. The modelling and retrieval of documents using index expressions, 1991, SIGF.

[2] S. Miyamoto. Information retrieval based on fuzzy associations, 1990.