Hindi, Telugu, Oromo, English CLIR Evaluation

This paper presents the Cross-Language Information Retrieval (CLIR) experiments of the Language Technologies Research Centre (LTRC, IIIT-Hyderabad), conducted as part of our participation in the ad-hoc track of CLEF 2006. This is our first participation in the CLEF evaluation campaign, and we focused on Afaan Oromo, Hindi and Telugu as source (query) languages for retrieving documents from an English text collection. We used a dictionary-based approach for CLIR. After a brief description of our CLIR system, we discuss the evaluation results of the experiments we conducted using the CLEF 2006 dataset.
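As a rough illustration of what a dictionary-based approach to query translation can look like, the sketch below replaces each source-language query term with its English translations from a bilingual lexicon. It is a minimal sketch under stated assumptions, not the system described in the paper: the names `TOY_DICTIONARY` and `translate_query` are hypothetical, the three-entry Hindi-English lexicon is a toy stand-in for a full dictionary, and passing out-of-vocabulary terms through unchanged is an assumed policy.

```python
from typing import Dict, List

# Hypothetical toy Hindi-English lexicon; a real system would load a full
# bilingual dictionary rather than three hand-written entries.
TOY_DICTIONARY: Dict[str, List[str]] = {
    "नदी": ["river"],
    "पानी": ["water"],
    "प्रदूषण": ["pollution", "contamination"],
}


def translate_query(query: str, dictionary: Dict[str, List[str]]) -> List[str]:
    """Replace each whitespace-separated source term with its translations."""
    translated: List[str] = []
    for term in query.split():
        # Assumption: terms missing from the dictionary are kept as-is,
        # so names and acronyms still reach the English index.
        translated.extend(dictionary.get(term, [term]))
    return translated


if __name__ == "__main__":
    english_terms = translate_query("नदी पानी प्रदूषण CLEF", TOY_DICTIONARY)
    print(english_terms)
```

The resulting English terms would then be submitted as a query to a monolingual English retrieval engine. Keeping every translation of an ambiguous term favours recall; a system could instead select a single translation to favour precision.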
