Customizing Local Context Analysis for Farsi Information Retrieval by Using a New Concept Weighting Algorithm

A large amount of digital Farsi content has recently been produced in the Middle East. Local context analysis (LCA) is an automated query expansion method that adds concepts to the original query based on the results of an initial retrieval with that query. In our previous work we attempted to tune this method for the Farsi language by varying three parameters: the number of concepts used for query expansion, the number of initially retrieved documents used for local feedback, and the number of passages used for concept discovery and weighting. In this paper we seek to further customize the method for Farsi information retrieval. To compare our work with the previous attempts, we have used the Hamshahri collection and 60. We have experimented with different numbers of concepts and have also changed the concept weighting algorithm to improve retrieval performance.
