Abstract This paper is concerned with the organization and retrieval of records in document retrieval systems that admit imprecision, in the form of fuzziness, in document characterization and retrieval rules. A mathematical model for such systems, based on the theory of fuzzy sets, is introduced. A document retrieval system, as defined in this paper, is a quadruple (X, D, Q, γ), where X is a collection of document descriptions (also referred to as index records, or records); D is the descriptor set; Q is a query set; and γ: Q × X → [0, 1], called the matching function, assigns to each pair (q, x), where q ∈ Q and x ∈ X, a number γ(q, x) in the interval [0, 1], called the matching index for the query q and the document description x. In our system model, each document description x is defined as a fuzzy set in the descriptor set D. As a fuzzy subset of D, each x is characterized by a membership function μ_x: D → [0, 1], where μ_x(d), representing the grade of membership of d in x, is referred to as the index weight of the descriptor d for the document description x. The retrieval response of the system is defined in terms of the matching function γ. More specifically, given a query q, the index record retrieval response, f(q), is defined to be a fuzzy set in X whose membership function is given by μ_{f(q)}(x) = γ(q, x). To deal with the data organization problems in our conceptual model, the conventional concept of a list is extended to a fuzzy list. Specifically, L(d), the fuzzy list corresponding to a descriptor d, is defined as a fuzzy set in the document description set X whose membership function is given by μ_{L(d)}(x) = μ_x(d). In this way, the notion of an inverted file structure can be extended to the fuzzy data in our retrieval model.
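The definitions above can be illustrated with a minimal Python sketch. It is not the paper's implementation. The abstract does not specify the form of the matching function γ or of the queries, so the sketch assumes that a query is a plain collection of descriptors and that γ(q, x) is the minimum index weight over the query's descriptors (a common fuzzy "AND"). The function names, record identifiers, and weights are hypothetical.

```python
from typing import Dict, Iterable

# A document description x is a fuzzy set in D: descriptor d -> index weight mu_x(d).
FuzzySet = Dict[str, float]


def build_fuzzy_lists(records: Dict[str, FuzzySet]) -> Dict[str, FuzzySet]:
    """Fuzzy inverted file: for each descriptor d, L(d) maps record x to mu_x(d)."""
    lists: Dict[str, FuzzySet] = {}
    for x, weights in records.items():
        for d, w in weights.items():
            if w > 0.0:  # records with zero membership are left out of L(d)
                lists.setdefault(d, {})[x] = w
    return lists


def matching_index(query: Iterable[str], weights: FuzzySet) -> float:
    """ASSUMED gamma(q, x): minimum index weight over the query's descriptors.

    The abstract leaves gamma unspecified; min is only one possible choice.
    An empty query yields 0.0 here, purely by convention of this sketch.
    """
    return min((weights.get(d, 0.0) for d in query), default=0.0)


def retrieval_response(query: Iterable[str], records: Dict[str, FuzzySet]) -> FuzzySet:
    """f(q): fuzzy set in X whose membership function is x -> gamma(q, x)."""
    q = list(query)  # materialize so the query can be reused for every record
    return {x: matching_index(q, weights) for x, weights in records.items()}


# Hypothetical index records with made-up index weights.
records = {
    "x1": {"fuzzy sets": 0.9, "retrieval": 0.6},
    "x2": {"retrieval": 0.8, "file organization": 0.7},
}

fuzzy_lists = build_fuzzy_lists(records)
response = retrieval_response(["fuzzy sets", "retrieval"], records)
```

Under these assumptions, `fuzzy_lists["retrieval"]` holds both records with their index weights, while `response` grades each record by its weakest matching descriptor, so a record lacking any queried descriptor receives a matching index of 0.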