An Integrated Approach to Information Retrieval with Fuzzy Clustering and Fuzzy Inferencing

We present an integrated approach to fuzzy information retrieval that combines techniques from fuzzy set theory with methodologies from textual retrieval in order to improve retrieval performance. Fuzzy logic rules, with truth values in the interval [0, 1], are used to capture the relationships among index terms. We adapt several fuzzy clustering methods (such as fuzzy c-means and fuzzy hierarchical clustering) to the task of clustering documents with respect to the terms. The resulting clusters provide a basis for building the fuzzy logic rules and can also be used to form hyperlinks between documents. A previously developed fuzzy logic system, shown to be sound and complete, is applied for fuzzy inferencing to derive useful modifications of the initial query, which then guide the search for relevant documents. The method thus combines fuzzy inference with the traditional relevance feedback approach to retrieval. Its advantage lies in the emphasis on semantic information (embodied in the rules and the inference mechanisms), which should lead to superior performance. We present a series of experiments conducted to validate this approach, along with results and conclusions.
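
To make the clustering step concrete, the following is a minimal sketch of the standard fuzzy c-means iteration applied to documents represented as term-weight vectors. It is not the adapted method described above: the function name `fuzzy_c_means`, the toy `docs` matrix, the fuzzifier `m = 2`, the Euclidean distance, and the evenly spaced choice of initial centers are all assumptions made for illustration.

```python
import math

def fuzzy_c_means(points, c, m=2.0, iters=100, tol=1e-6):
    """Standard fuzzy c-means on a list of equal-length float vectors.

    Returns (centers, u), where u[i][k] is the membership degree of
    point i in cluster k and each row of u sums to 1.
    """
    n, dim = len(points), len(points[0])
    # Assumption: deterministic start, using evenly spaced points as centers.
    centers = [list(points[(k * n) // c]) for k in range(c)]
    u = [[0.0] * c for _ in range(n)]
    exponent = 2.0 / (m - 1.0)
    for _ in range(iters):
        max_change = 0.0
        # Membership update: closer centers receive higher membership.
        for i, x in enumerate(points):
            d = [math.dist(x, v) for v in centers]
            zeros = [k for k, dk in enumerate(d) if dk == 0.0]
            if zeros:
                # A point sitting exactly on a center belongs fully to it.
                new = [1.0 / len(zeros) if k in zeros else 0.0 for k in range(c)]
            else:
                new = [1.0 / sum((d[k] / d[j]) ** exponent for j in range(c))
                       for k in range(c)]
            max_change = max(max_change,
                             max(abs(a - b) for a, b in zip(new, u[i])))
            u[i] = new
        # Center update: mean of the points weighted by membership ** m.
        for k in range(c):
            w = [u[i][k] ** m for i in range(n)]
            total = sum(w)
            if total > 0.0:
                centers[k] = [sum(w[i] * points[i][j] for i in range(n)) / total
                              for j in range(dim)]
        if max_change < tol:
            break
    return centers, u

# Hypothetical document-term weights: rows are documents, columns are terms.
docs = [
    [0.9, 0.8, 0.0, 0.1],
    [0.8, 0.9, 0.1, 0.0],
    [0.0, 0.1, 0.9, 0.8],
    [0.1, 0.0, 0.8, 0.9],
]
centers, u = fuzzy_c_means(docs, c=2)
for i, row in enumerate(u):
    print(i, [round(v, 3) for v in row])
```

Each row of `u` gives one document's graded membership in every cluster, so a document can belong partially to several clusters rather than being assigned to exactly one.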
