Supporting Arabic Cross-Lingual Retrieval Using Contextual Information

One of the main problems affecting the performance of cross-language information retrieval (CLIR) systems is how to disambiguate translations and, since this usually cannot be done fully automatically, how to smoothly integrate the user into the disambiguation process. To give users confidence when selecting a translation they may not even be able to read or understand, the system must provide sufficient information about the translation alternatives and their meanings. In this paper, we present a CLIR tool that automatically translates the user query and lets the user interactively select relevant terms using contextual information. This information is obtained from a parallel corpus and describes each translation in the user's query language. Furthermore, a user study was conducted to identify weaknesses in both the disambiguation algorithm and the interface design. The outcome of this user study gives a much clearer view of what CLIR should offer to users and how it should offer it.
