Evaluation of Axiomatic Approaches to Crosslanguage Retrieval

Integrating word sense disambiguation (WSD) into an information retrieval system could improve its performance. This is the major motivation for the Robust WSD tasks of the Ad-Hoc Track of the CLEF 2009 campaign. For these tasks we have built a customizable and flexible retrieval system. The best-performing configuration of this system is based on research in the area of axiomatic approaches to information retrieval. Further, our experiments show that configurations that incorporate WSD information into the retrieval process outperformed those without it. The performance difference is more pronounced for the monolingual task than for the bilingual task. Finally, we show that our query translation approach works effectively, even when applied in the monolingual task.
