FuFaIR: a Fuzzy Farsi Information Retrieval System

Persian (Farsi) is one of the languages of the Middle East. A significant number of Persian documents are available in digital form, and more are created every day. There is therefore a need for a high-precision information retrieval system for this language. This paper discusses the design, implementation, and testing of FuFaIR, a fuzzy retrieval system for Persian whose query language also supports fuzzy quantifiers. Tests were conducted on a standard Persian test corpus called Hamshahri. The results are positive and indicate that FuFaIR could notably outperform well-known systems such as those based on the vector space model.
