NTCIR-6 Experiments using Pattern Matched Translation Extraction

This paper describes our experimental methods and results at the Sixth NTCIR Workshop Meeting on Evaluation of Information Access Technologies. We introduce a Pattern Matched Translation Extraction (PMTE) approach to the analysis of mixed-language web pages, which uses pattern matching to extract translation pairs automatically. The experimental results demonstrate that the proposed method is effective at translating Out-of-Vocabulary (OOV) terms, a well-known problem in cross-language information retrieval (CLIR), question answering (QA), machine translation (MT), and knowledge discovery (KD). We also report the results of our single-language information retrieval (SLIR) experiments and illustrate performance across the different collections in STAGE 2 of NTCIR-6.
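To make the idea of pattern-based extraction from mixed-language text concrete, the following is a minimal sketch, not the PMTE implementation itself. It assumes one hypothetical surface pattern (a run of Chinese characters immediately followed by an English gloss in parentheses); the regular expression, the function name `extract_translation_pairs`, and the sample snippet are all illustrative assumptions rather than details taken from the paper.

```python
import re

# Hypothetical pattern (an assumption for illustration): a run of CJK
# characters immediately followed by an English gloss enclosed in ASCII
# or full-width parentheses, e.g. "禽流感（Avian Influenza）".
PAIR_PATTERN = re.compile(
    r"([\u4e00-\u9fff]{2,})\s*[(（]\s*([A-Za-z][A-Za-z .'\-]*[A-Za-z])\s*[)）]"
)


def extract_translation_pairs(text):
    """Return (source_term, english_candidate) tuples found in mixed-language text."""
    return [(m.group(1), m.group(2)) for m in PAIR_PATTERN.finditer(text)]


# Illustrative snippet, standing in for text taken from a mixed-language web page.
snippet = "禽流感（Avian Influenza）疫情持續，世界衛生組織(World Health Organization)發出警告。"
pairs = extract_translation_pairs(snippet)
for source_term, candidate in pairs:
    print(source_term, "->", candidate)
```

A single pattern like this over-captures whenever the Chinese term is preceded by other Chinese text without punctuation, because the whole run is taken as the source term. A practical system would therefore need several patterns plus some way to find the term boundary and rank competing candidates.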
