Improving Recall for Hindi, Telugu, Oromo to English CLIR

This paper presents the Cross-Language Information Retrieval (CLIR) experiments conducted by the Language Technologies Research Centre (LTRC, IIIT-Hyderabad) for the ad-hoc track of CLEF 2007. We describe approaches that improve the recall of query translation by handling morphological and spelling variations in source-language keywords. We also report query expansion experiments in CLIR that use a source-language monolingual corpus for Hindi, Telugu and English. Finally, we examine the effect of using an Oromo stemmer in an Oromo-English CLIR system and report results on the CLEF 2007 dataset.
