Enhancing query translation with relevance feedback in translingual information retrieval

Relevance feedback (RF) is an effective technique for improving retrieval and has been widely studied in both monolingual and translingual information retrieval (TLIR). Studies of RF in TLIR have focused on query expansion (QE), in which queries are reformulated before and/or after they are translated. However, RF in TLIR can not only help select better query terms but also enhance query translation by adjusting translation probabilities and even resolving some out-of-vocabulary terms. In this paper, we propose a novel relevance feedback method called translation enhancement (TE), which uses translation relationships extracted from relevant documents to revise the translation probabilities of query terms and to identify additional translation alternatives, so that the translated queries are better tuned to the current search. We studied TE using pseudo-relevance feedback (PRF) and interactive relevance feedback (IRF). Our results show that TE can significantly improve TLIR with both types of relevance feedback, and that the improvement is comparable to that of query expansion. More importantly, the effects of translation enhancement and query expansion are complementary: integrating them produces further improvement and makes TLIR more robust across a variety of queries.
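The idea of revising translation probabilities from feedback can be illustrated with a minimal sketch. This is not the paper's actual TE algorithm: it assumes that translation pairs have already been extracted from relevant documents (the extraction step is not shown), and it combines them with the prior probabilities by simple linear interpolation. The function name `enhance_translations`, the `weight` parameter, and the toy English-to-Spanish data are all hypothetical.

```python
from collections import defaultdict


def enhance_translations(prior, feedback_pairs, weight=0.5):
    """Interpolate prior translation probabilities with feedback estimates.

    prior: dict mapping a source term to {target term: probability}.
    feedback_pairs: list of (source term, target term) pairs, assumed to
        have been extracted from relevant documents beforehand.
    weight: hypothetical interpolation weight given to feedback evidence.
    """
    # Count how often each translation pair was observed in the feedback.
    counts = defaultdict(lambda: defaultdict(int))
    for src, tgt in feedback_pairs:
        counts[src][tgt] += 1

    enhanced = {}
    for src in set(prior) | set(counts):
        old = prior.get(src, {})
        seen = counts.get(src, {})
        total = sum(seen.values())
        if total == 0:
            # No feedback evidence for this term: keep the prior unchanged.
            enhanced[src] = dict(old)
        elif not old:
            # Out-of-vocabulary term: rely on the feedback estimate alone.
            enhanced[src] = {t: c / total for t, c in seen.items()}
        else:
            # Mix the prior with the relative frequencies from feedback.
            enhanced[src] = {
                t: (1 - weight) * old.get(t, 0.0) + weight * seen.get(t, 0) / total
                for t in set(old) | set(seen)
            }
    return enhanced


# Toy data (assumed): "bank" is ambiguous, "euro" is missing from the prior.
prior = {"bank": {"banco": 0.5, "orilla": 0.5}}
feedback = [("bank", "banco")] * 3 + [("bank", "orilla"), ("euro", "euro")]
result = enhance_translations(prior, feedback)
```

In this toy run the feedback shifts probability mass toward "banco" for the ambiguous term "bank", and the previously untranslatable term "euro" gains a translation, mirroring the two effects the abstract attributes to TE.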
